Abstract

A classical random walk $(S_t, t \in \mathbb{N})$ is defined by $S_t := \sum_{n=0}^{t} X_n$, where the $(X_n)$ are i.i.d. When the increments $(X_n)_{n \in \mathbb{N}}$ form a one-order Markov chain, a short memory is introduced in the dynamics of $(S_t)$. This so-called "persistent" random walk is no longer Markovian and, under suitable conditions, the rescaled process converges towards the integrated telegraph noise (ITN) as the time-scale and space-scale parameters tend to zero (see [10], [15], [16]). The ITN process is effectively non-Markovian too. The aim is to consider persistent random walks $(S_t)$ whose increments are Markov chains with variable order, which can be infinite. This variable memory is highlighted by a one-to-one correspondence between $(X_n)$ and a suitable Variable Length Markov Chain (VLMC), since for a VLMC the dependency on the past can be unbounded. The key fact is to consider the non-Markovian letter process $(X_n)$ as the margin of a couple $(X_n, M_n)_{n \ge 0}$, where $(M_n)_{n \ge 0}$ stands for the memory of the process $(X_n)$. We prove that, under a suitable rescaling, $(S_n, X_n, M_n)$ converges in distribution towards a time-continuous process $(S^0(t), X(t), M(t))$. The process $(S^0(t))$ is a semi-Markov and Piecewise Deterministic Markov Process whose paths are piecewise linear.
1 Introduction

Classical random walks are defined by

$$S_t := \sum_{n=0}^{t} X_n \tag{1.1}$$

for $t \in \mathbb{N}$ and for i.i.d. increments $(X_n)_{n \in \mathbb{N}}$. It is well known that a suitable rescaling of the random walk yields the standard Brownian motion as the time-scale and space-scale parameters tend to zero. When the increments $(X_n)_{n \in \mathbb{N}}$ are defined as a one-order Markov chain, a short memory is introduced in the dynamics of the stochastic paths: the process is called in the literature the persistent random walk, a correlated random walk, or also a Kac walk (see [13], [17], [18]). The random walk is no longer Markovian and, under suitable conditions, the rescaled process converges towards the integrated telegraph noise (ITN), see [10], [15] and [16]. The ITN process is effectively non-Markovian too.

Our aim is to define processes $(X_n)_{n \in \mathbb{N}}$ with variable memory and thus to generalize this convergence result to random walks whose increments are higher order Markov chains. When $(X_n)_{n \in \mathbb{N}}$ is a Markov chain of finite order, it is natural to think that the limit process should be very close to the integrated telegraph noise. That is why we are mostly interested in constructing infinite length Markov chains or in dealing with Variable Length Markov Chains (VLMC), for which the dependency on the past is unbounded.

A VLMC can be defined as follows (this probabilistic presentation comes from [2]; other, more statistical points of view can be found in [14], [8]). Let $\mathcal{L} = \{0,1\}^{-\mathbb{N}}$ be the set of left-infinite words on the alphabet $\{0,1\}$. Consider a complete binary tree (each node has 0 or 2 children) whose finite leaves are words on the alphabet $\{0,1\}$. To each leaf $c$ (not necessarily finite) is attached a Bernoulli distribution denoted by $q_c$. Each leaf is called a context and this probabilized tree is called a context tree. See for instance the simple infinite comb in Figure 2: the set of leaves $\mathcal{C}$ is defined by

$$\mathcal{C} := \{0^n 1,\ n \ge 0\} \cup \{0^\infty\},$$

where $0^n 1$ represents the sequence $00\dots01$ composed of $n$ characters '0' and one character '1'. By convention $0^0 1 = 1$. The set of leaves contains one infinite leaf $0^\infty$ and a countable set of finite leaves $0^n 1$. The prefix function $\mathrm{pref} : \mathcal{L} = \{0,1\}^{-\mathbb{N}} \to \mathcal{C}$ indicates the length of the last run of '0': for instance,

$$\mathrm{pref}(\dots 1000) = 0001 = 0^3 1.$$

For a general context tree and for any left-infinite word $U$, we define $\mathrm{pref}(U)$ in a similar way as the first suffix of $U$, reading from right to left, that appears as a leaf of the context tree. The associated VLMC is the $\mathcal{L}$-valued Markov chain $(U_n)_{n \ge 0}$ defined by the transitions

$$P(U_{n+1} = U_n \ell \mid U_n) = q_{\mathrm{pref}(U_n)}(\ell), \tag{1.2}$$

where $\ell \in \{0,1\}$ is any letter. Notice that the VLMC is entirely determined by the data $q_c$, $c \in \mathcal{C}$. Moreover the order of dependence (the memory) depends on the past itself.

For a given VLMC $(U_n)_{n \ge 0}$, define $X_n$ as the last letter of $U_n$ for any $n \ge 0$. When the context tree associated with $(U_n)$ is infinite, the letter process $(X_n)_{n \ge 0}$ is non-Markovian, because the transition probabilities (1.2) indicate that $X_{n+1}$ depends on a variable and unbounded number of previous letters. The corresponding random walk $(S_t)$ defined by (1.1) is no longer Markovian and is, in some sense, very persistent. We therefore investigate the following natural questions: is the random walk of the same nature as in the one-order Markov case? Does the rescaled process converge to some limit process? Is the limiting process analogous to the ITN?
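To make the prefix function of the simple infinite comb concrete, here is a minimal Python sketch. It works on a finite suffix of a left-infinite word, encoded as a string of '0' and '1' written in the usual left-to-right order. The function name `pref_simple_comb` and the string encoding are assumptions made for this illustration only; a suffix containing no '1' is rejected because the context cannot be decided from it.

```python
def pref_simple_comb(suffix: str) -> str:
    """Return the context 0^n 1 of a word whose known suffix is `suffix`.

    Hypothetical helper: the word is read from right to left and the
    length n of its last run of '0' is counted, as for the simple comb.
    """
    if not suffix or set(suffix) - {"0", "1"}:
        raise ValueError("expected a non-empty word on the alphabet {0, 1}")
    # n = number of trailing '0' characters (the last run of zeros)
    n = len(suffix) - len(suffix.rstrip("0"))
    if n == len(suffix):
        # only zeros are known: the context could be 0^k 1 for any k >= n, or 0^infinity
        raise ValueError("no '1' in the known suffix: the context is undetermined")
    return "0" * n + "1"
```

For instance, `pref_simple_comb("1000")` returns `"0001"`, which is the context $0^3 1$ of the example in the text.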

Recall that $X_n$ is the last letter of a VLMC $(U_n)$. The key point of view is the following: we consider the non-Markovian letter process $(X_n)$ as the margin of a couple $(X_n, M_n)_{n \ge 0}$, where $(M_n)_{n \ge 0}$ stands for the memory of the process $(X_n)$. It is reasonable to believe that $M_n = |\mathrm{pref}(U_n)|$ is a good candidate, where the notation $|w|$ stands for the length of a word $w$. More precisely, in the particular case of a two-letter alphabet $\mathcal{A}$, the Markov chain $(X_n, M_n)_{n \ge 0}$ valued in the state space $\mathcal{A} \times \mathbb{N}^*$ is defined by the transition probabilities: for $\ell, \ell' \in \mathcal{A}$, $\ell \ne \ell'$,

$$Q\big((\ell,n),(\ell,n+1)\big) = 1 - \alpha_{\ell,n}, \qquad Q\big((\ell,n),(\ell',1)\big) = \alpha_{\ell,n}.$$

Note that $\alpha_{\ell,k}$ is the probability of changing letter after a run of length $k$ of the letter $\ell$, that is

$$\alpha_{\ell,k} = P(X_{n+1} \ne \ell \mid X_n = \ell,\ M_n = k). \tag{1.3}$$

Introducing the sequence of breaking times

$$T_0 = 0, \qquad T_{k+1} = \inf\{n > T_k,\ X_n \ne X_{T_k}\},$$

it is easy to see that $(X_n, T_n)_{n \ge 0}$ is a semi-Markov process (see [..., Chapter 10] and [9], [11]).

In Section 2, we consider a Markov chain $(X_n, M_n)_{n \ge 0}$, where $(X_n)_{n \ge 0}$ is a letter process, the letters belong to an alphabet $\mathcal{A} := \{a_1, a_2, \dots, a_K\}$, and $(M_n)_{n \ge 0}$ stands for the memory of the process $(X_n)$. The state space associated with this Markov chain is $\{a_1, a_2, \dots, a_K\} \times \overline{\mathbb{N}}$. We give in Section 2.2 the properties of $(X_n, M_n)_{n \ge 0}$ and we determine, in Section 2.2.2, necessary and sufficient conditions for the existence and uniqueness of a stationary probability measure. We would like to emphasize that $(X_n)$ is non-Markovian in general.

In Section 3, we consider two particular cases of VLMC, associated with the simple infinite comb and the double infinite comb. In each of these two cases, the stationary measure can be calculated explicitly, in [2] for the simple comb and in the Appendix for the double comb. We make precise the correspondence between the process $(X_n, M_n)_{n \ge 0}$ defined in Section 2 and the VLMC $(U_n)$ whose last letter is $X_n$. Namely, we establish the dictionary between the stationary measure for the Markov chain $(X_n, M_n)_{n \ge 0}$ and the stationary measure for the VLMC (see Theorems 3.1 and 3.3). Thanks to these results, we do not have to worry about the point of view (couple letter/memory or VLMC) when considering the persistent random walk $S_t := \sum_{n=0}^{t} X_n$ under the stationary regime.

Section 4 is devoted to the study of $(S_n)$. In particular, we determine the explicit distribution of the r.v. $S_n$, see Proposition 4.1. Although the result is complicated, we are able to determine explicitly the generating function of the r.v. $S_{\tau+1}$, where $\tau$ is a geometric r.v. independent of $(X_n, M_n)_{n \ge 0}$. One way to compare the process $(S_n)$ with a classical random walk is to analyse how both processes fluctuate at infinity. We have the two following limit theorems, see Section 4.4:

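$$\lim_{n \to \infty} \frac{S_n}{n} = S \quad \text{and} \quad \sqrt{n}\left(\frac{S_n}{n} - S\right) \longrightarrow \mathcal{N}(0, \sigma),$$

where $\mathcal{N}(0, \sigma)$ is a Gaussian distribution with mean 0 and variance $\sigma^2$, and $S$ and $\sigma$ are constants which can be expressed in terms of the model parameters.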

Finally, in Section 5 we study the persistent random walk. After a convenient scaling, it converges towards a "generalized ITN", as proved in Theorem 5.1. More precisely, we focus on the limit in law of Markov chains of the type $(X_n^\varepsilon, M_n^\varepsilon)$ which depend on a small parameter $\varepsilon > 0$. We suppose that $X_n^\varepsilon$ takes its values in $\{-1, 1\}$, $X_0^\varepsilon = 1$ and

$$P(X_{n+1}^\varepsilon = 1 \mid X_n^\varepsilon = -1,\ M_n^\varepsilon = k) = f_1(k\varepsilon)\,\varepsilon + o(\varepsilon), \tag{1.4}$$

$$P(X_{n+1}^\varepsilon = -1 \mid X_n^\varepsilon = 1,\ M_n^\varepsilon = k) = f_2(k\varepsilon)\,\varepsilon + o(\varepsilon), \tag{1.5}$$

where $f_1, f_2 : [0, \infty[ \to \mathbb{R}$ are non-negative and right-continuous functions. Note that (1.4) and (1.5) mean that $(X_n^\varepsilon)$ has a conservative behaviour: if $X_n^\varepsilon = 1$ (resp. $= -1$), the probability that $X_{n+1}^\varepsilon$ changes, i.e. $X_{n+1}^\varepsilon = -1$ (resp. $X_{n+1}^\varepsilon = 1$), is small for convenient $f_1, f_2$ and is measured by the parameter $\varepsilon$.

Under additional assumptions (see the beginning of Section 5 for details), it is actually possible to rescale the triplet $(X_n^\varepsilon, M_n^\varepsilon, S_n^\varepsilon)$ so that it converges as $\varepsilon \to 0$. For simplicity, we only present the scaling procedure concerning $S_n^\varepsilon$. The process $(S^\varepsilon(t), t \ge 0)$ is piecewise linear and satisfies

$$S^\varepsilon(n\varepsilon) = \varepsilon S_n^\varepsilon, \quad \text{for any } n \in \mathbb{N}. \tag{1.6}$$

We prove (see Theorem 5.1 for a more complete result) that $(S^\varepsilon(t), t \ge 0)$ converges in distribution, as $\varepsilon \to 0$, to $(S^0(t), t \ge 0)$ where

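$$S^0(t) = \int_0^t (-1)^{N^0(u)}\,du.$$

Here, $(N^0(t))$ is the counting process with jump times $(\xi_n)_{n \ge 1}$:

$$N^0(t) = \sum_{n \ge 1} 1_{\{\xi_n \le t\}},$$

where $(\xi_{n+1} - \xi_n,\ n \ge 0)$ is a sequence of independent r.v. such that $\xi_0 = 0$ and

$$P(\xi_{2n+1} - \xi_{2n} > t) = \exp\left(-\int_0^t f_2(u)\,du\right),$$

$$P(\xi_{2n+2} - \xi_{2n+1} > t) = \exp\left(-\int_0^t f_1(u)\,du\right)$$

for any $t \ge 0$ and $n \ge 0$, where $f_1, f_2$ satisfy (5.49).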

The process $(S^0(t), t \ge 0)$ is called the Generalized Integrated Telegraph Noise (see [10] for the ITN). It is both a semi-Markov process and a Piecewise Deterministic Markov Process
[5j [H 0] and its trajectories look like a zig-zag. 
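The zig-zag shape of the limit trajectories can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two rate functions are taken constant (so that the sojourn times are exponential, which is the classical ITN case), the process starts with velocity $+1$ (in line with $X_0^\varepsilon = 1$), and the velocity is reversed at each jump time. The names `sample_jump_times` and `itn_position` are hypothetical.

```python
import random

def sample_jump_times(rate_from_plus, rate_from_minus, horizon, rng):
    """Sample increasing jump times in [0, horizon) with constant switching rates.

    Assumption: constant rates, i.e. survival functions exp(-rate * t).
    """
    times, t, moving_up = [], 0.0, True
    while True:
        rate = rate_from_plus if moving_up else rate_from_minus
        t += rng.expovariate(rate)       # exponential sojourn time
        if t >= horizon:
            return times
        times.append(t)
        moving_up = not moving_up        # the velocity is reversed at each jump

def itn_position(jump_times, t):
    """Position at time t of a path with velocity +1 at time 0, flipped at each jump."""
    position, velocity, previous = 0.0, 1.0, 0.0
    for xi in jump_times:                # jump_times must be sorted increasingly
        if xi >= t:
            break
        position += velocity * (xi - previous)
        velocity, previous = -velocity, xi
    return position + velocity * (t - previous)
```

With jump times `[1.0, 3.0]`, the path goes up on $[0,1]$, down on $[1,3]$ and up again afterwards, so `itn_position([1.0, 3.0], 4.0)` returns `0.0`. Non-constant rate functions would require sampling the sojourn times from their survival functions instead of calling `expovariate`.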

2 The Markov chain $(X_n, M_n)_{n \ge 0}$
2.1 Definition

Let us consider the finite set $\mathcal{A} = \{a_1, \dots, a_K\}$ with $K > 1$ elements. To each $a_i$ is associated a sequence $(\alpha_{i,n})_{n \ge 1} \in\, ]0,1[^{\mathbb{N}^*}$, where $\mathbb{N}^* = \{1, 2, 3, \dots\}$. We can now introduce the Markov chain $(X_n, M_n)_{n \in \mathbb{N}}$ valued in the state space $\{a_1, \dots, a_K\} \times \mathbb{N}^*$ with transition probabilities

$$Q\big((a_i,n),(a_i,n+1)\big) = 1 - \alpha_{i,n}, \qquad Q\big((a_i,n),(a_j,1)\big) = \alpha_{i,n}\,p_{i,j}, \quad 1 \le i \ne j \le K,\ n \ge 1, \tag{2.1}$$

where $P := (p_{i,j})$ is a given $K \times K$ transition matrix satisfying $p_{i,i} = 0$ for all $1 \le i \le K$, $p_{i,j} \ge 0$ for all $i \ne j$ and $\sum_{j=1}^{K} p_{i,j} = 1$ for all $i$. In fact, $p_{i,j}$ is the probability to move from $a_i$ to $a_j$ knowing that we leave $a_i$.

Figure 1: A path description of the process $(X_n, M_n)_{n \ge 0}$.

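Moreover, in order to deal with VLMC later on, we extend the definition of the Markov chain to the state space $\{a_1, \dots, a_K\} \times \overline{\mathbb{N}}$ with $\overline{\mathbb{N}} = \mathbb{N}^* \cup \{\infty\}$. Therefore we introduce $\alpha_{i,\infty} \in\, ]0,1[$ for all $1 \le i \le K$ such that

$$Q\big((a_i,\infty),(a_i,\infty)\big) = 1 - \alpha_{i,\infty}, \qquad Q\big((a_i,\infty),(a_j,1)\big) = \alpha_{i,\infty}\,p_{i,j}, \quad i \ne j. \tag{2.2}$$

Note that $\alpha_{i,k}$ is the probability of changing letter after a run of length $k$ of $a_i$, that is

$$\alpha_{i,k} = P(X_{n+1} \ne a_i \mid X_n = a_i,\ M_n = k). \tag{2.3}$$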

There are strong links between $(X_n)$ and $(M_n)$. In particular, if $M_0 = 1$, $M_n$ can be expressed in terms of $X_0, \dots, X_n$. Indeed, if the sequence $(X_j)_{j=0,\dots,n}$ is constant then $M_n = n + 1$, and $M_n = \inf\{1 \le i \le n;\ X_{n-i} \ne X_n\}$ otherwise. In other words, one has

$$M_n = 1 + \sup\{0 \le i \le n,\ X_{n-j} = X_n,\ \forall j \in \{0, \dots, i\}\} \tag{2.4}$$
$$\phantom{M_n} = \inf\{0 \le i \le n,\ X_{n-i} \ne X_n\},$$

with the convention $\inf \emptyset = n + 1$ in the second expression.
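The expression of the memory in terms of the letters can be checked on small examples with the following Python sketch. It assumes $M_0 = 1$, as in the text, and takes the observed letters $X_0, \dots, X_n$ as a list; the function name `memory_from_letters` is an assumption made for this illustration.

```python
def memory_from_letters(letters):
    """Return M_n computed from [X_0, ..., X_n], assuming M_0 = 1.

    M_n is the length of the last run of the current letter X_n,
    i.e. n + 1 if the sequence is constant, and otherwise the smallest
    i >= 1 such that X_{n-i} differs from X_n.
    """
    if not letters:
        raise ValueError("at least X_0 is required")
    n = len(letters) - 1
    current = letters[n]
    m = 1
    # walk backwards while the letters agree with the current one
    while m <= n and letters[n - m] == current:
        m += 1
    return m
```

For example, the letters `["a1", "a2", "a2"]` give a memory equal to 2, and a constant sequence of length 3 gives a memory equal to 3.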

Let us explain how $(X_n, M_n)$ moves in the case $M_0 = 1$ and $X_0 = a_i$. The variable $M_n$ increases by one unit at each time step until $X_n$ switches to some $a_j \ne a_i$. At that first jump time, the memory is reset to 1, and so on. Thus $M_n$ represents the variable memory of $(X_t)_{0 \le t \le n}$, since it counts the last consecutive stays (at $X_n$) before $n$. Moreover the dynamics of the jumps of $X_n$ is governed by the value of $M_n$. In Figure 1 we have drawn the following
trajectory of (X n , M n ) corresponding to the values a\, a%, a-i, a^,, a-ii «2> «4> «4> «3, ^3, ^3, ^3 • • • 
of X n . 
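The dynamics of the couple letter/memory can be simulated directly from the transition probabilities. The Python sketch below is a minimal illustration under the following assumptions: letters are encoded by the indices `0, ..., K-1`, the switching probabilities are given by a function `alpha(i, n)`, and `p` is a list of rows of the matrix $(p_{i,j})$ with zero diagonal. The name `simulate_letter_memory` is hypothetical.

```python
import random

def simulate_letter_memory(alpha, p, x0, m0, steps, rng):
    """Simulate (X_n, M_n) for n = 0..steps and return the list of states.

    alpha(i, n): probability of leaving letter i after a run of length n.
    p[i][j]:     probability of moving to letter j given that letter i is left.
    """
    x, m = x0, m0
    path = [(x, m)]
    for _ in range(steps):
        if rng.random() < alpha(x, m):
            # the letter changes: pick the new letter from row x of p, reset the memory
            x = rng.choices(range(len(p)), weights=p[x])[0]
            m = 1
        else:
            # the letter is kept: the memory grows by one unit
            m += 1
        path.append((x, m))
    return path
```

A call such as `simulate_letter_memory(lambda i, n: 0.3, [[0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]], 0, 1, 20, random.Random(0))` returns a path in which the memory is reset to 1 exactly at the times where the letter changes.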

Let us note that in the particular case: K = 2, a± = 0, ai = 1 and ay n = a,- for all n > 1, 
then (X n ) n >o is a sequence of independent Bernoulli random variables. 

2.2 Properties of the Markov chain (X, M) 

First we investigate under which conditions either $(X_n)$ or $(M_n)$ is Markov. Secondly we prove the existence of an invariant probability measure, and finally we present a path description of the process $(X_n)$.
2.2.1 Link between the margins

A natural question arising about a 2-dimensional Markov chain $(X_n, M_n)_{n \in \mathbb{N}}$ is to know whether the margins are Markovian too. The following proposition says that, in the general case, neither $(X_n)_{n \in \mathbb{N}}$ nor $(M_n)_{n \in \mathbb{N}}$ is a Markov chain.

Proposition 2.1. Assume that $M_0 = 1$.

(i) The margin process $(X_n)_{n \in \mathbb{N}}$ is Markovian if and only if for all $1 \le i \le K$, $n \mapsto \alpha_{i,n}$ is constant. In that case the transition matrix $Q^X$ of $X$ is given by:

$$Q^X(i,j) = \begin{cases} 1 - \alpha_{i,1} & \text{if } i = j, \\ \alpha_{i,1}\,p_{i,j} & \text{otherwise.} \end{cases} \tag{2.5}$$

(ii) The margin process $(M_n)_{n \in \mathbb{N}}$ is Markovian for any initial condition $X_0$ if and only if for all $n \ge 1$, the function $i \mapsto \alpha_{i,n}$ is constant. In that case, the transition matrix $Q^M$ of $M$ is

$$Q^M(n,j) = \begin{cases} 1 - \alpha_{1,n} & \text{if } j = n + 1, \\ \alpha_{1,n} & \text{if } j = 1. \end{cases}$$

Proof.


(i) For a given vector $(x_0, \dots, x_n) \in \{a_1, \dots, a_K\}^{n+1}$, let us first denote

$$\delta_{i,n} := P(X_{n+1} = a_i \mid X_n = x_n, \dots, X_0 = x_0).$$

According to (2.4) let us introduce:

$$m_n = 1 + \sup\{0 \le i \le n :\ x_{n-j} = x_n,\ \forall j \in \{0, \dots, i\}\}. \tag{2.6}$$

We have to distinguish two cases.

(a) If $x_n = a_i$ then $M_n = m_n$ and therefore $\delta_{i,n} = 1 - \alpha_{i,m_n}$. We can choose different values of $x_0, \dots, x_{n-1}$ such that $m_n = 1, 2, \dots, n$. Hence if $(X_n)$ is Markovian then $\delta_{i,n}$ is independent of $n$ and $(x_0, \dots, x_{n-1})$, and $(\alpha_{i,k})_{k \ge 1}$ is constant.

(b) If $x_n = a_j \ne a_i$ then $\delta_{i,n} = \alpha_{j,m_n}\,p_{j,i} = \alpha_{j,1}\,p_{j,i}$, implying that $(X_n)$ is effectively Markovian.

(ii) Let us study the process $(M_n)_{n \ge 0}$. Set

$$d_{i,n} := P(X_0 = a_i,\ M_0 = 1,\ M_1 = 2, \dots, M_n = n + 1,\ M_{n+1} = 1).$$

We have

$$d_{i,n} = P(X_0 = a_i,\ M_0 = 1,\ X_1 = a_i,\ M_1 = 2, \dots, X_n = a_i,\ M_n = n + 1,\ M_{n+1} = 1).$$

Since $(X_n, M_n)$ is a Markov chain, using (2.1) and (2.3) we get

$$d_{i,n} = (1 - \alpha_{i,1}) \times \dots \times (1 - \alpha_{i,n})\,\alpha_{i,n+1}.$$

Suppose that $(M_n)$ is a Markov chain with transition matrix $Q^M$. Then

$$d_{i,n} = Q^M(n+1, 1)\,P(X_0 = a_i,\ M_0 = 1, \dots, M_n = n + 1) = Q^M(n+1, 1)\,(1 - \alpha_{i,1}) \times \dots \times (1 - \alpha_{i,n}).$$

Consequently, $\alpha_{i,n+1} = Q^M(n+1, 1)$ is independent of $i$, and thus $\alpha_{i,n+1} = \alpha_{1,n+1}$ for all $i$ and $n$.

As for the converse, since $i \mapsto \alpha_{i,n}$ is constant, it is clear that (2.1) implies that $(M_n)$ is Markov. ■
Remark 2.2. The one-dimensional memory process $(M_n)_{n \in \mathbb{N}}$ could be replaced by a $K$-dimensional process. For each state $a_j$, define

$$\mathcal{M}_n^{(j)} = \inf\{0 \le k \le n;\ X_{n-k} \ne a_j\}.$$

There is a one-to-one correspondence between $(X_n, M_n)$ and $(\mathcal{M}_n^{(1)}, \dots, \mathcal{M}_n^{(K)})$. Consequently the vector memory $(\mathcal{M}_n^{(1)}, \dots, \mathcal{M}_n^{(K)})$ is a Markov chain. For instance $(X_n, M_n) = (a_3, 4)$ corresponds to $(\mathcal{M}_n^{(1)}, \dots, \mathcal{M}_n^{(K)}) = (0, 0, 4, 0, \dots, 0)$. Indeed, if the $k$-th coordinate of the vector does not vanish then $X_n = a_k$. This allows us to recover $X_n$ from $(\mathcal{M}_n^{(1)}, \dots, \mathcal{M}_n^{(K)})$; as for $M_n$, we have $M_n = \mathcal{M}_n^{(k)}$ when $X_n = a_k$.

2.2.2 Invariant probability measure for $(X_n, M_n)_{n \ge 0}$

Let us now investigate the existence of an invariant probability measure. It is convenient to introduce, for all $1 \le i \le K$,

$$\Theta_i := \sum_{n \ge 1} \prod_{k=1}^{n-1} (1 - \alpha_{i,k}), \tag{2.7}$$

and for $m \ge 1$,

$$\mathcal{P}_i(m) := \prod_{k=1}^{m-1} (1 - \alpha_{i,k}), \tag{2.8}$$

with the convention $\prod_{k=1}^{0} = 1$.

$\mathcal{P}_i(k+1)$ represents the conditional probability that the process $(X_n)$ stays at least a time interval of length $k$ in the same state $i$:

$$\mathcal{P}_i(k+1) = P(X_1 = \dots = X_k = i \mid X_0 = i,\ M_0 = 1).$$
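The two quantities just introduced are easy to evaluate numerically. The following Python sketch computes the product (2.8) and a truncation of the series (2.7) for a single letter; the function names `survival_product` and `theta_truncated`, and the truncation level `n_max`, are assumptions made for this illustration. The truncated sum is only a lower bound of the series, which may be infinite.

```python
def survival_product(alpha, m):
    """Product of (1 - alpha(k)) for k = 1..m-1, with the empty product equal to 1."""
    product = 1.0
    for k in range(1, m):
        product *= 1.0 - alpha(k)
    return product

def theta_truncated(alpha, n_max):
    """Partial sum of the series (2.7): sum over n = 1..n_max of the products (2.8)."""
    total, product = 0.0, 1.0
    for n in range(1, n_max + 1):
        total += product              # product equals survival_product(alpha, n)
        product *= 1.0 - alpha(n)
    return total
```

For a constant switching probability equal to 0.25 the series is geometric with sum $1/0.25 = 4$, which `theta_truncated(lambda n: 0.25, 200)` reproduces up to rounding.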

Proposition 2.3. Let $P = (p_{i,j})$ be a given irreducible transition matrix.

(i) The Markov chain $(X_n, M_n)_{n \ge 0}$ with transition probabilities defined by (2.1) and (2.2) admits an invariant probability measure $\nu$ on the space $\{a_1, \dots, a_K\} \times \overline{\mathbb{N}}$ if and only if $\Theta_1, \dots, \Theta_K$ defined by (2.7) are all finite. This invariant probability measure is unique.

(ii) Moreover, if we denote by $v^*$ the unique positive vector associated with the largest eigenvalue of $P = (p_{i,j})$ by Frobenius's theorem, then $\nu(a_i, \infty) = 0$ for all $1 \le i \le K$ and, for $n \ge 1$,

$$\nu(a_i, n) = \frac{v_i^*}{\langle \Theta, v^* \rangle}\,\mathcal{P}_i(n),$$

where $\Theta = {}^t(\Theta_1, \dots, \Theta_K)$ and $\langle \Theta, v^* \rangle = \sum_{i=1}^{K} \Theta_i v_i^*$.

Remark 2.4. The invariant measure $\nu$ can be decomposed in the following way: for $1 \le i \le K$ and $n \ge 1$,

$$\nu(a_i, n) = \nu^X(a_i)\,\nu_i(n), \tag{2.9}$$

where

$$\nu^X(a_i) = \frac{\Theta_i v_i^*}{\langle \Theta, v^* \rangle} \quad \text{and} \quad \nu_i(n) = \frac{\mathcal{P}_i(n)}{\Theta_i}.$$

If $(X_0, M_0) \sim \nu$, then, for any $n \ge 1$, $\nu^X$ is the law of $X_n$, and $\nu_i$ is the conditional distribution of $M_n$ given $X_n = a_i$.

Let us consider the particular case when, for all $1 \le i \le K$ and $n \ge 1$,

$$1 - \alpha_{i,n} = \frac{\rho_i}{n} \quad \text{with } \rho_i > 0.$$

After straightforward calculations, we obtain $\Theta_i = e^{\rho_i}$ and

$$\nu(a_i, n) = \frac{v_i^*}{\langle \Theta, v^* \rangle}\,\frac{\rho_i^{\,n-1}}{(n-1)!}.$$

In other words, if $(X_0, M_0) \sim \nu$ then the distribution of the couple $(X_n, M_n)$ can be described as follows: $X_n$ is chosen first with the probability $\nu^X$ and afterwards, conditionally on $X_n = a_i$, $M_n - 1$ is Poisson distributed with parameter $\rho_i$.
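This particular case can be verified numerically. The Python sketch below assumes that the probability of keeping the letter after a run of length $n$ equals $\rho / n$ for a single letter, with $0 < \rho < 1$ so that all switching probabilities lie in $]0,1[$. It computes a truncated normalising constant and the normalised weights, which can then be compared with $e^{\rho}$ and with Poisson probabilities shifted by one. The name `conditional_memory_law` and the truncation level are assumptions made for this illustration.

```python
import math

def conditional_memory_law(rho, n_max):
    """Return (theta, law) where law[n - 1] approximates P(M = n | letter), n = 1..n_max.

    Assumption: the probability of staying after a run of length n is rho / n.
    """
    weights, product = [], 1.0
    for n in range(1, n_max + 1):
        weights.append(product)       # product of rho / k for k = 1..n-1
        product *= rho / n
    theta = sum(weights)              # truncated version of the series (2.7)
    return theta, [w / theta for w in weights]

theta, law = conditional_memory_law(0.5, 40)
poisson_shifted = [math.exp(-0.5) * 0.5 ** (n - 1) / math.factorial(n - 1)
                   for n in range(1, 41)]
```

In this sketch `theta` agrees with `math.exp(0.5)` and `law` agrees with `poisson_shifted` up to rounding, which reflects the Poisson description of the memory minus one.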

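Proof of Proposition 2.3.

For notational simplicity, we shall fix $a_i = i$ for all $1 \le i \le K$.

Step 1. Invariant measure: Let $\nu$ be a non-negative measure. Since $(X_n, M_n)$ is valued in the state space $\{1, 2, 3, \dots, K\} \times \overline{\mathbb{N}}$, $\nu$ is an invariant measure if and only if

$$\nu(i,k) = \sum_{(j,l)} \nu(j,l)\,Q\big((j,l),(i,k)\big) = \nu(i,k-1)(1 - \alpha_{i,k-1})\,1_{\{k > 1\}} + 1_{\{k = 1\}} \sum_{j \ne i} \sum_{l \in \overline{\mathbb{N}}} \nu(j,l)\,\alpha_{j,l}\,p_{j,i} \tag{2.10}$$

for any $1 \le i \le K$ and $k \ge 1$, and

$$\nu(i,\infty) = \nu(i,\infty)(1 - \alpha_{i,\infty}). \tag{2.11}$$

Obviously (2.11) implies that $\nu(i,\infty) = 0$ for all $1 \le i \le K$. Relation (2.10) with $k \ge 2$ is equivalent to $\nu(i,k) = \nu(i,k-1)(1 - \alpha_{i,k-1})$, which implies for $k \ge 2$

$$\nu(i,k) = \nu(i,1) \prod_{r=1}^{k-1} (1 - \alpha_{i,r}) = \nu(i,1)\,\mathcal{P}_i(k). \tag{2.12}$$

The particular situation $k = 1$ in (2.10), together with (2.12), leads to

$$\nu(i,1) = \sum_{j \ne i} p_{j,i} \Big( \sum_{l \ge 1} \alpha_{j,l}\,\mathcal{P}_j(l) \Big) \nu(j,1). \tag{2.13}$$

Using (2.12) and (2.7) we get:

$$\sum_{1 \le i \le K,\ n \ge 1} \nu(i,n) = \sum_{i=1}^{K} \sum_{n \ge 1} \nu(i,1)\,\mathcal{P}_i(n) = \sum_{i=1}^{K} \nu(i,1)\,\Theta_i.$$

Finally $\nu$ is a probability measure iff $\nu(i,k)$ is given by (2.12) for any $k \ge 1$, the vector $(\nu(1,1), \dots, \nu(K,1))$ solves (2.13) and

$$\sum_{i=1}^{K} \nu(i,1)\,\Theta_i = 1. \tag{2.14}$$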

Step 2. Necessary condition: Assume that

$$\Theta_i < \infty, \quad \forall i \in \{1, \dots, K\}. \tag{2.15}$$

Writing $\alpha_{j,l} = -(1 - \alpha_{j,l}) + 1$, we develop the expression (2.13) using (2.15):

$$\nu(i,1) = \sum_{j \ne i} p_{j,i} \Big\{ \sum_{l \ge 1} \big( -\mathcal{P}_j(l+1) + \mathcal{P}_j(l) \big) \Big\} \nu(j,1) = \sum_{j \ne i} p_{j,i}\,\nu(j,1).$$

The vector $v := (\nu(1,1), \nu(2,1), \dots, \nu(K,1))$ satisfies

$$v = {}^t P\,v, \quad \text{with } P = (p_{i,j}). \tag{2.16}$$

Let $v^*$ be the unique positive vector associated with the largest eigenvalue of $P = (p_{i,j})$ by Frobenius' theorem; then there exists $\lambda > 0$ such that

$$(\nu(1,1), \nu(2,1), \dots, \nu(K,1)) = \lambda v^*.$$

Using (2.14) we deduce:

$$\sum_{i=1}^{K} \nu(i,1)\,\Theta_i = \lambda \sum_{i=1}^{K} \Theta_i v_i^* = \lambda \langle \Theta, v^* \rangle.$$

Hence $\lambda = 1/\langle \Theta, v^* \rangle$ and, by (2.12), $\nu$ is determined by $\nu(i,n) = \frac{v_i^*}{\langle \Theta, v^* \rangle}\,\mathcal{P}_i(n)$, which gives existence and uniqueness of $\nu$.

Step 3. Sufficient condition: Conversely let us assume the existence of an invariant probability measure $\nu$. We shall prove (2.15). Obviously (2.13) implies that if $\nu(i,1) = 0$ for some $i$, then $\nu(j,1) = 0$ for all $j$. Therefore $\nu = 0$, which contradicts the fact that $\nu$ is a probability measure. Hence $\nu(i,1) > 0$ for all $1 \le i \le K$. It is clear that (2.14) implies (2.15). ■

Since the Markov chain $(X_n, M_n)_{n \in \mathbb{N}}$ admits an invariant probability measure, we can extend its definition to $\mathbb{Z}$ (instead of $\mathbb{N}$) such that it is stationary. This extension will be useful to connect with certain Variable Length Markov Chains (defined later in Section 3).

Remark 2.5. Since $\nu$ is the invariant probability measure of the Markov chain $(X_n, M_n)_{n \in \mathbb{Z}}$, the time-reversed process $(X_{-n}, M_{-n})_{n \in \mathbb{N}}$ is a Markov chain with invariant probability measure $\nu$ and transition probabilities $\widehat{Q}$, where:

$$\nu(x)\,\widehat{Q}(x,y) = \nu(y)\,Q(y,x), \quad \forall x, y \in \{a_1, \dots, a_K\} \times \mathbb{N}^*.$$

From (2.1) and Proposition 2.3 we easily obtain

$$\widehat{Q}\big((a_i, n+1), (a_i, n)\big) = 1, \quad \forall i \in \{1, \dots, K\},\ n \ge 1,$$

$$\widehat{Q}\big((a_j, 1), (a_i, n)\big) = \frac{v_i^*}{v_j^*}\,p_{i,j}\,\alpha_{i,n}\,\mathcal{P}_i(n), \quad i \ne j,\ n \ge 1.$$

2.2.3 Paths description of X

From now on, for notational simplicity, we only consider the case $K = 2$. The trajectory $n \mapsto X_n$ is determined as soon as the transition times between the different states are known. Let us define $T_0 = 0$ and the sequence of stopping times, for $n \ge 1$,

$$T_n = \inf\{k > T_{n-1} :\ X_k \ne X_{T_{n-1}}\}. \tag{2.17}$$

Proposition 2.6. (i) Let us assume that $\Theta_1$ and $\Theta_2$ defined by (2.7) are finite. Then the random variables $(T_{n+1} - T_n)_{n \ge 1}$ are almost surely finite and independent.

(ii) (a) If $X_0 = a_2$ and $M_0 = m \ge 1$, then for all $i \ge 1$ and $n \ge 1$,

$$P(T_1 = i) = \alpha_{2,m+i-1} \prod_{j=m}^{m+i-2} (1 - \alpha_{2,j}), \tag{2.18}$$

and

$$P(T_{2n+1} - T_{2n} = i) = \alpha_{2,i}\,\mathcal{P}_2(i), \tag{2.19}$$

$$P(T_{2n} - T_{2n-1} = i) = \alpha_{1,i}\,\mathcal{P}_1(i). \tag{2.20}$$

(b) If $X_0 = a_1$ and $M_0 = m \ge 1$ then (2.18) and (2.19) (resp. (2.20)) are still valid after replacing $(\alpha_{2,\cdot})$ by $(\alpha_{1,\cdot})$ (resp. $(\alpha_{1,\cdot})$ by $(\alpha_{2,\cdot})$).

Remark 2.7. 1. Note that, if $X_0 = a_2$ and $M_0 = m$ then for all $n \ge 1$,

$$P(T_1 \ge i) = \prod_{j=m}^{m+i-2} (1 - \alpha_{2,j}), \qquad P(T_{2n+1} - T_{2n} \ge i) = \mathcal{P}_2(i) \qquad \text{and} \qquad P(T_{2n} - T_{2n-1} \ge i) = \mathcal{P}_1(i).$$

2. Between two consecutive jump times, the memory increases linearly:

$$M_{T_n + t} = 1 + t, \quad 0 \le t < T_{n+1} - T_n,\ n \ge 1. \tag{2.21}$$
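The law of a sojourn time started with memory 1 can be tabulated directly from the switching probabilities of one letter. The Python sketch below returns the probabilities of the sojourn lengths $1, \dots, i_{\max}$ together with the remaining tail mass; the name `sojourn_pmf` and the truncation level are assumptions made for this illustration. By telescoping, the listed probabilities and the tail always add up to 1.

```python
def sojourn_pmf(alpha, i_max):
    """Return (pmf, tail) for a sojourn started with memory 1.

    pmf[i - 1] = alpha(i) * prod_{k < i} (1 - alpha(k)) is the probability of a
    sojourn of length i, and tail is the probability of a sojourn longer than i_max.
    """
    pmf, survival = [], 1.0
    for i in range(1, i_max + 1):
        pmf.append(alpha(i) * survival)   # leave exactly after a run of length i
        survival *= 1.0 - alpha(i)        # stay at least one more step
    return pmf, survival
```

With a constant switching probability 0.25, `sojourn_pmf(lambda k: 0.25, 50)` gives a geometric law whose first two values are 0.25 and 0.1875.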

Proof of Proposition 2.6.

Let us consider $X_0 = a_2$ and $M_0 = m$. Then

$$P(T_1 = i) = P(X_1 = a_2,\ X_2 = a_2, \dots, X_{i-1} = a_2,\ X_i = a_1).$$

Using the Markov property, we deduce

$$P(T_1 = i) = \prod_{j=m}^{m+i-2} Q\big((a_2,j),(a_2,j+1)\big)\ Q\big((a_2,m+i-1),(a_1,1)\big).$$

Equation (2.18) is therefore a direct consequence of (2.1). Using $\alpha_{2,m+i-1} = 1 - (1 - \alpha_{2,m+i-1})$, it is easy to deduce that

$$\sum_{i \ge 1} \alpha_{2,m+i-1} \prod_{j=m}^{m+i-2} (1 - \alpha_{2,j}) = 1.$$

This shows that $P(T_1 < \infty) = 1$.

Moreover, conditioning on $X_0 = a_2$ and $M_0 = m$, for $j \ge 1$ and $i \ge 1$, one has

$$P(T_2 - T_1 = j,\ T_1 = i) = P(X_1 = \dots = X_{i-1} = a_2,\ X_i = \dots = X_{j+i-1} = a_1,\ X_{j+i} = a_2)$$
$$= \alpha_{2,m+i-1} \prod_{r=m}^{m+i-2} (1 - \alpha_{2,r}) \prod_{l=1}^{j-1} (1 - \alpha_{1,l})\,\alpha_{1,j}$$
$$= P(T_1 = i)\,P(T_2 - T_1 = j),$$

which leads to the independence between $T_1$ and $T_2 - T_1$. The independence of $T_3 - T_2$ and $(T_1, T_2 - T_1)$ can be proved similarly. The proof of (ii) (a) of Proposition 2.6 follows by induction. The proof of (ii) (b) is analogous. ■
Figure 2: Infinite simple comb probabilized context tree.

3 The variable length Markov chain $(U_n)_{n \ge 0}$

In this section, the relation between the Markov chain $(X_n, M_n)_{n \in \mathbb{Z}}$ valued in $\{0,1\} \times \mathbb{N}^*$ and the VLMC $(U_n)_{n \ge 0}$ introduced in Section 1 is highlighted by Theorems 3.1 and 3.3. For two very particular variable length Markov chains, we prove that these two models are equivalent. We consider two cases of VLMC for two specific context trees: the simple infinite comb and the double infinite comb.

From now on and until the end of this paper, for the sake of simplicity, we only consider the case $K = 2$.

3.1 The simple infinite comb

Let us consider the alphabet $\{a_1, a_2\}$ with $a_1 = 0$ and $a_2 = 1$. We associate with a Markov chain of type $(X_n, M_n)_{n \in \mathbb{Z}}$ defined in Section 2 a unique VLMC, and vice versa. With a slight abuse of language, this VLMC is called the infinite comb. We refer to [2] for a complete definition. It is proved in [2] that in the irreducible case, i.e. when $q_{0^\infty}(0) \ne 1$, $(U_n)_{n \ge 0}$ has a unique stationary probability measure $\pi$ on the set $\mathcal{L}$ of left-infinite words if and only if $\Theta_1$ is finite. Similarly, if $\Theta_1 < \infty$, Proposition 2.3 implies that $(X_n, M_n)_{n \in \mathbb{Z}}$ has a unique invariant probability measure. The following theorem sheds light on the links between the VLMC $(U_n)_{n \ge 0}$ and the chain $(X_n, M_n)_{n \in \mathbb{Z}}$ and their respective stationary probability measures.

Theorem 3.1 (infinite comb).

(i) Let $(X_n, M_n)_{n \in \mathbb{Z}}$ be a stationary Markov chain valued in $\{0,1\} \times \mathbb{N}^*$, with transition probabilities (2.1) and (2.2), with $p_{1,2} = p_{2,1} = 1$. We suppose $\Theta_1 < \infty$ (where $\Theta_1$ is defined in (2.7)) and, $\forall n \in \mathbb{N}^*$,

$$\alpha_{2,n} = \alpha_2. \tag{3.1}$$

We define for all $n \in \mathbb{N}$,

$$U_n = \dots X_{n-2} X_{n-1} X_n. \tag{3.2}$$

Then, $(U_n)_{n \ge 0}$ is a stationary variable length Markov chain associated with the infinite comb with

$$q_1(0) = \alpha_2, \qquad q_{0^n 1}(1) = \alpha_{1,n}, \qquad q_{0^\infty}(1) = \alpha_{1,\infty}. \tag{3.3}$$

The initial distribution is given by $U_0 = \dots X_{-2} X_{-1} X_0$.

(ii) Conversely, consider a stationary VLMC $(U_n)_{n \ge 0}$ satisfying (3.3). For $n \ge 0$, define $X_n$ as the last letter of $U_n$ and $(M_n)_{n \ge 0}$ as in (2.4). Then $\Theta_1 < \infty$ (where $\Theta_1$ is defined in (2.7)) and $(X_n, M_n)_{n \ge 0}$ is a stationary Markov chain with transitions (2.1), (2.2) and (3.3) and initial data $(X_0, M_0)$. A stationary Markov chain $(X_n, M_n)_{n \in \mathbb{Z}}$ can therefore be defined using the classical procedure of extension from $\mathbb{N}$ to $\mathbb{Z}$.
The following table summarizes the correspondence between these two models and can be used as a dictionary (in the case $a_1 = 0$ and $a_2 = 1$).

$(X_n, M_n)_{n \in \mathbb{Z}}$   |   $(U_n)_{n \ge 0}$
$\nu$   |   $\pi$
$Q\big((a_i,k),(a_i,k+1)\big) = 1 - \alpha_{i,k}$   |   $q_{a_i^k a_j}(a_i)$ for $j \ne i$
$Q\big((a_i,k),(a_j,1)\big) = \alpha_{i,k}$   |   $q_{a_i^k a_j}(a_j)$ for $j \ne i$

Proof.

(i) Due to the definition (3.2) of the process $(U_n)_{n \ge 0}$, for all $s \in \{0,1\}$, the events $\{U_{n+1} = U_n s\}$ and $\{X_{n+1} = s\}$ are equal. Therefore $(U_n)_{n \ge 0}$ is a Markov chain as soon as

$$\delta_{s,u} := P(U_{n+1} = U_n s \mid U_n = u) = P(X_{n+1} = s \mid X_n = u_0, \dots, X_{n-k} = u_{-k}, \dots)$$

only depends on $s \in \{0,1\}$ and $u$, where $u = \dots u_{-1} u_0 \in \{0,1\}^{-\mathbb{N}}$.

Suppose first that $u_0 = 1$. Since $M_n \in \mathbb{N}^*$, $(\alpha_{2,n})_{n \ge 1}$ is constant and $\mathrm{pref}(u) = 1$, then (2.1) and (3.3) imply that

$$\delta_{s,u} = (1 - \alpha_2)\,1_{\{s=1\}} + \alpha_2\,1_{\{s=0\}} = q_1(s).$$

Let us now consider the case $u_0 = 0$. Recall (see Proposition 2.3) that $M_n \in \mathbb{N}^*$. Consequently, there exists $m \in \mathbb{N}^*$ such that $u = \dots 1 0^m$. Then $M_n = m$, $\mathrm{pref}(U_n) = 0^m 1$ and

$$P(X_{n+1} = s \mid X_n = 0,\ M_n = m, \dots) = (1 - \alpha_{1,m})\,1_{\{s=0\}} + \alpha_{1,m}\,1_{\{s=1\}} = q_{0^m 1}(s).$$

Next, we prove that $(U_n)_{n \ge 0}$ is stationary. Note that (3.2) yields $U_n = \psi\big((X_{n-i})_{i \ge 0}\big)$ a.s., where $\psi\big((x_{-n})_{n \ge 0}\big) = \dots x_{-2} x_{-1} x_0$. Therefore, for any $\ell \ge 0$,

$$E\big[f(U_{n+1-\ell}, \dots, U_{n+1})\big] = E\Big[f\big(\psi((X_{n+1-\ell-i})_{i \ge 0}), \dots, \psi((X_{n+1-i})_{i \ge 0})\big)\Big].$$

Since $(X_n, M_n)_{n \in \mathbb{Z}}$ is stationary, $(X_{m+1-i})_{i \ge 0} \overset{(d)}{=} (X_{m-i})_{i \ge 0}$ for any $m \in \mathbb{Z}$. This implies that $(U_n)_{n \ge 0}$ is stationary.



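(ii) Let us now assume that $(U_n)_{n \ge 0}$ is a stationary VLMC. Let $x, x' \in \{0,1\}$, $k, k' \ge 1$, $n \in \mathbb{N}$ and

$$\delta' := P(X_{n+1} = x',\ M_{n+1} = k' \mid X_n = x,\ M_n = k, \dots).$$

Then

$$\delta' = 1_{\{k'=1,\ x \ne x'\}}\,P(U_{n+1} = U_n x' \mid U_n = \dots x' x^k) + 1_{\{k'=k+1,\ x = x'\}}\,P(U_{n+1} = U_n x \mid U_n = \dots (1-x) x^k)$$

$$= 1_{\{k'=1\}} \Big[ 1_{\{x=0,\ x'=1\}}\,P(U_{n+1} = U_n 1 \mid U_n = \dots 1 0^k) + 1_{\{x=1,\ x'=0\}}\,P(U_{n+1} = U_n 0 \mid U_n = \dots 0 1^k) \Big]$$
$$\quad + 1_{\{k'=k+1\}} \Big[ 1_{\{x=x'=1\}}\,P(U_{n+1} = U_n 1 \mid U_n = \dots 0 1^k) + 1_{\{x=x'=0\}}\,P(U_{n+1} = U_n 0 \mid U_n = \dots 1 0^k) \Big]$$

$$= 1_{\{k'=1\}} \Big[ 1_{\{x=0,\ x'=1\}}\,q_{0^k 1}(1) + 1_{\{x=1,\ x'=0\}}\,q_1(0) \Big] + 1_{\{k'=k+1\}} \Big[ 1_{\{x=x'=1\}}\,q_1(1) + 1_{\{x=x'=0\}}\,q_{0^k 1}(0) \Big].$$

Using (3.3) we get

$$\delta' = 1_{\{k'=1\}} \Big[ 1_{\{x=0,\ x'=1\}}\,\alpha_{1,k} + 1_{\{x=1,\ x'=0\}}\,\alpha_2 \Big] + 1_{\{k'=k+1\}} \Big[ 1_{\{x=x'=1\}}\,(1 - \alpha_2) + 1_{\{x=x'=0\}}\,(1 - \alpha_{1,k}) \Big].$$

Then (2.1) follows directly with $\alpha_{2,n} = \alpha_2$.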
The following result is a corollary of Proposition 2.3. It enables us to compare the expression of the invariant measure $\nu$ from the model $(X_n, M_n)_{n \in \mathbb{Z}}$ with the invariant measure $\pi$ for the VLMC infinite comb (see Section B in the Appendix for notations about VLMC).

Corollary 3.2. Under the condition $\Theta_1 < \infty$, there exists a unique invariant probability measure $\nu$ for the Markov chain $(X_n, M_n)$, given by $\nu(a_1, \infty) = \nu(a_2, \infty) = 0$ and, for all $m \ge 1$,

$$\nu(a_1, m) = \frac{\mathcal{P}_1(m)}{\Theta_1 + \Theta_2} \quad \text{and} \quad \nu(a_2, m) = \frac{\alpha_2 (1 - \alpha_2)^{m-1}}{1 + \alpha_2 \Theta_1}, \tag{3.4}$$

where $\Theta_2 = 1/\alpha_2$. In particular one gets

$$\nu(a_2, \mathbb{N}^*) = \frac{1}{1 + \alpha_2 \Theta_1} = \pi(a_2).$$

Proof. Proposition 2.3 with

$$P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad v^* = {}^t(1, 1)$$

leads to (3.4) and

$$\nu(a_2, \mathbb{N}^*) = \sum_{m \ge 1} \nu(a_2, m) = \frac{1}{1 + \alpha_2 \Theta_1}.$$

Consequently one has

$$\nu(a_2, \mathbb{N}^*) = \frac{1}{1 + q_1(0) \sum_{n \ge 1} \prod_{k=1}^{n-1} \big(1 - q_{0^k 1}(1)\big)} = \frac{1}{1 + \sum_{n \ge 0} \prod_{k=0}^{n} q_{0^k 1}(0)} = \pi(a_2),$$

which is fortunately (!) the invariant measure obtained in [2]. ■
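The corollary can be checked numerically on an example. The Python sketch below builds a truncated version of the invariant measure for two letters, using weights proportional to the products (2.8) and normalised by the sum of the two truncated series (2.7); this corresponds to the case where each letter can only be left for the other one. The function name `invariant_measure_two_letters`, the truncation level and the numerical values are assumptions made for this illustration.

```python
def invariant_measure_two_letters(alpha1, alpha2, n_max):
    """Return {1: [...], 2: [...]} with entry n - 1 approximating nu(a_i, n), n = 1..n_max."""
    products = {}
    for letter, alpha in ((1, alpha1), (2, alpha2)):
        product, row = 1.0, []
        for n in range(1, n_max + 1):
            row.append(product)           # product of (1 - alpha(k)) for k < n
            product *= 1.0 - alpha(n)
        products[letter] = row
    total = sum(products[1]) + sum(products[2])   # truncated Theta_1 + Theta_2
    return {letter: [w / total for w in row] for letter, row in products.items()}

# Example values (assumed): alpha_{1,n} = 0.5 and alpha_{2,n} = alpha_2 = 0.25.
nu = invariant_measure_two_letters(lambda n: 0.5, lambda n: 0.25, 400)
mass_of_a2 = sum(nu[2])
```

Here the first series equals 2, so the closed form of the corollary gives $1/(1 + 0.25 \times 2) = 2/3$ for the mass of the second letter, and `mass_of_a2` matches this value up to rounding. The entrance balance at memory 1 can be checked in the same way: `nu[1][0]` equals the sum over `n` of `0.25 * nu[2][n]`.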

3.2 The double infinite comb

Let us now present the double infinite comb. Consider the probabilized context tree given in Figure 3 (hereafter called double infinite comb). In this case, there are two infinite leaves $0^\infty$ and $1^\infty$ and countably many finite leaves $0^n 1$ and $1^n 0$, $n \in \mathbb{N}^*$, so that

$$\mathcal{C} = \{0^n 1,\ n \ge 1\} \cup \{1^n 0,\ n \ge 1\} \cup \{0^\infty\} \cup \{1^\infty\}.$$

Figure 3: Infinite double comb probabilized context tree.

The data of a corresponding VLMC thus consists of Bernoulli probability measures on $\{0,1\}$:

$$q_{0^\infty},\ q_{1^\infty}, \quad \text{and} \quad q_{0^n 1},\ q_{1^n 0}, \quad n \in \mathbb{N}^*.$$

We refer to Appendix B to see that the finiteness of $\Theta_1$ and $\Theta_2$ implies the existence of a unique invariant measure for this VLMC.
Theorem 3.3 (double infinite comb). (i) Let $(X_n, M_n)_{n \in \mathbb{Z}}$ be a stationary Markov chain with transition probabilities (2.1) and (2.2). We suppose $\Theta_1 < \infty$ and $\Theta_2 < \infty$ (where $\Theta_i$ is defined in (2.7)). Then the process $(U_n)_{n \ge 0}$ defined by (3.2) is a stationary variable length Markov chain associated with the double infinite comb with

$$q_{1^n 0}(0) = \alpha_{2,n}, \qquad q_{1^\infty}(0) = \alpha_{2,\infty}, \qquad q_{0^n 1}(1) = \alpha_{1,n}, \qquad q_{0^\infty}(1) = \alpha_{1,\infty}. \tag{3.5}$$

The initial data is given by $U_0 = \dots X_{-2} X_{-1} X_0$.

(ii) Conversely, let $(U_n)_{n \ge 0}$ be a stationary VLMC satisfying (3.5). For $n \ge 0$, define $X_n$ as the last letter of $U_n$ and $(M_n)_{n \ge 0}$ as in (2.4). Then $(X_n, M_n)_{n \ge 0}$ is a stationary Markov chain with transitions (2.1), (2.2) and (3.5) and with initial data $(X_0, M_0)$. This stationary Markov chain can be extended to the time space $\mathbb{Z}$ as usual.

The arguments for the proof are similar to those presented in Theorem 3.1.

As for the simple infinite comb, the invariant measure of the first margin of the Markov chain $(X_n, M_n)_{n \in \mathbb{Z}}$ corresponding to the double infinite comb can be compared with the invariant measure $\pi$ for the VLMC double infinite comb calculated in Appendix B.

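Corollary 3.4. Under the condition $\Theta_1 < \infty$ and $\Theta_2 < \infty$, there exists a unique invariant probability measure $\nu$ for the Markov chain $(X_n, M_n)$, given by $\nu(a_1, \infty) = \nu(a_2, \infty) = 0$ and, for all $m \ge 1$ and $i = 1, 2$,

$$\nu(a_i, m) = \frac{\mathcal{P}_i(m)}{\Theta_1 + \Theta_2}. \tag{3.6}$$

Consequently one gets

$$\nu(a_2, \mathbb{N}^*) = \frac{\Theta_2}{\Theta_1 + \Theta_2} = \pi(a_2).$$

Proof. Again (3.6) is a direct consequence of Proposition 2.3. Summing up, we obtain

$$\nu(a_2, \mathbb{N}^*) = \frac{\Theta_2}{\Theta_1 + \Theta_2},$$

with

$$\Theta_1 = \sum_{n \ge 1} \prod_{k=1}^{n-1} \big(1 - q_{0^k 1}(1)\big) = \sum_{n \ge 0} \prod_{k=1}^{n} q_{0^k 1}(0)$$

and

$$\Theta_2 = \sum_{n \ge 1} \prod_{k=1}^{n-1} \big(1 - q_{1^k 0}(0)\big) = \sum_{n \ge 0} \prod_{k=1}^{n} q_{1^k 0}(1),$$

which is exactly the calculation of $\pi(a_2)$ in Appendix B.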

Remark 3.5. The results developed in Theorem 3.1 and Theorem 3.3 can be generalized to context trees which are based on a finite alphabet $\{a_1, \dots, a_K\}$ and composed of a finite number of combs. The corresponding Markov chain $(X_n, M_n)_{n \in \mathbb{Z}}$ is then valued in the state space $\{a_1, \dots, a_K\} \times \overline{\mathbb{N}}$.

Of particular interest are variable length Markov chains $(U_n)$ associated with the infinite comb or the double infinite comb. While the sequence $(X_n)$ formed by the last letters of the process $U_n = \dots X_{n-1} X_n$ is not a Markov process, except for very particular $q_c$, the previous theorems show that it suffices to add a memory process $(M_n)$ to get a Markov chain $(X_n, M_n)$. Note that $(U_n)$ takes its values in the uncountable space $\mathcal{L}$, and Theorems 3.1 and 3.3 allow us to associate with it, by a one-to-one correspondence, a Markov chain $(X_n, M_n)$ which is valued in the countable set $\{0,1\} \times \overline{\mathbb{N}}$. This reduction of the size of the state space (which becomes here minimal) is made possible by the particular shape of the context tree: for instance, the VLMC associated with the bamboo blossom defined in [2] is not equivalent to a Markov chain $(X_n, M_n)$ with a real memory process. Nevertheless, for suitable VLMC we suggest introducing the following map

$$(U_n)_n \mapsto (\mathrm{pref}(U_n))_n,$$

which should permit to generalize the reduction of the state space. The image process is not 
Markovian in the general case, even under the stationary distribution for U n . A conjecture: 
the process (pref (U n )) n is Markovian (and thus defines an automaton) if and only if the 
associated context tree has a completeness property, studied in a companion paper pQ. 



4 Distribution of the persistent random walk 

By definition, a random walk $(S_n)_{n\ge 0}$ is a process whose increments are independent. It is often pertinent, for instance in modeling, to begin with the increments and then to study the associated random walk. Let us give an example coming from finance. Suppose that $S_n$ is the price at time $n$ of an asset. In the Cox, Ross and Rubinstein model, the non-arbitrage condition implies that the relative increments $\left(\frac{S_n - S_{n-1}}{S_{n-1}},\ n\ge 1\right)$ are independent.
We study here a class of additive processes $(S_n)$ of the type

$$S_n = \sum_{k=0}^{n} X_k, \quad n\ge 0, \tag{4.1}$$

where the increments $(X_n)$ are not independent. An attempt to consider increments with short dependency has already been developed in [15] and [16]. In these studies, the authors have supposed that $(X_n)$ is a Markov chain. We would like to go further here by introducing variable length memory between the increments.

We consider in this section a Markov chain $(X_n, M_n)_{n\ge 0}$ with transition probability (2.1) and (2.2) and we assume

$$K = 2, \quad a_1 = -1, \quad a_2 = 1.$$

The process $(S_n)$ defined by (4.1) is called a persistent random walk. This terminology comes from [7].

A path description of $(S_n)$ is given in Section 4.1, putting forward the breaking times $(T_n)_{n\ge 1}$. We give in Section 4.2 the explicit distribution of $S_n$. Although the law of $S_n$ is complicated, we can determine explicitly the distribution and the generating function of the position of the persistent random walk at an independent random time with geometric law. The double generating function will play an important role in Section 5.

We end this section by studying how $S_n$ fluctuates as $n \to \infty$. Indeed it is not so far from the persistent walk with one-order Markovian increments. We prove a law of large numbers and a central limit theorem. We recover the classical setting where $(X_n)$ is a Markov chain. We have introduced variable memory to $(X_n)$, but it seems that it is not sufficient to obtain new asymptotic behavior: it would therefore be very interesting to investigate the behaviour of the random walk when mixing assumptions are relaxed, i.e. when the length of the memory increases significantly to give a real persistent memory effect to the random walks.

4.1 Paths description

Since $X_n$ is $\{-1,1\}$-valued, it is clear that the trajectory of $(S_n)_{n\ge 0}$ is a sequence of straight lines with slopes $\pm 1$, and the instants of breaks are $(T_n)_{n\ge 1}$ which were introduced in (2.17).

Let us assume that $S_0 = X_0 = 1$; then the trajectory increases step by step till $T_1 - 1$, where it reaches a first local maximum. After that time, it decreases and reaches a local minimum at time $T_2 - 1$, and so on. The trajectory of $(S_t)_{t\in\mathbb{N}}$ corresponds to the linear interpolation between the sequence of points $(W_n, Z_n)_{n\ge 0}$ where $W_0 = 0$, $Z_0 = 1$ and for $n \ge 1$,

$$(W_n, Z_n) = \bigl(T_n - 1,\ S_{T_n} - (-1)^n\bigr) = \Bigl(T_n - 1,\ \sum_{k=1}^{n} (-1)^{k-1}(T_k - T_{k-1})\Bigr).$$



If $S_0 = X_0 = -1$, then the behaviour of the process is similar and reduces to a succession of increasing and decreasing parts.

[Figure 4: Trajectories of $(S_t)_{t\ge 0}$ when either $S_0 = 1$ or $S_0 = -1$. Panel labels: $X_1 \ldots X_8 = 1\,1\,(-1)(-1)(-1)(-1)\,1\,1$ and $X_1 \ldots X_7 = (-1)\,1\,1\,1\,1\,(-1)(-1)$.]

The trajectory $(S_t)_{t\ge 0}$ is a linear interpolation between $(W_n, Z'_n)_{n\ge 0}$ where $W_0 = 0$, $Z'_0 = -1$ and for $n \ge 1$,

$$(W_n, Z'_n) = \bigl(T_n - 1,\ S_{T_n} + (-1)^n\bigr) = \Bigl(T_n - 1,\ \sum_{k=1}^{n} (-1)^{k}(T_k - T_{k-1})\Bigr).$$

Note that $Z'_n = -Z_n$.

Let us introduce the counting process $(N_t)_{t\in\mathbb{N}}$ whose jump times are $T_n$:

$$N_t = \sup\{n \ge 1 : T_n \le t\} = \sum_{n\ge 1} 1_{\{T_n \le t\}}, \quad t\in\mathbb{N}. \tag{4.2}$$

From now on, we suppose that $(X_0, M_0) = (1,1)$. Note that the case $(X_0, M_0) = (-1,1)$ can be deduced from the former case by changing $X$ into $-X$.
The counting process $(N_t)_{t\ge 0}$ will play an important role in the study of $(S_t)_{t\ge 0}$ (see Section 5) and $(S_t)_{t\ge 0}$ can be expressed via $(N_t)_{t\ge 0}$ as:

$$S_t = \sum_{n=0}^{t} (-1)^{N_n}. \tag{4.3}$$

There is a one-to-one correspondence between $(M_t)_{t\ge 0}$ and $(T_n)_{n\ge 0}$:

$$\{k\,;\ M_k = 1\} = \{T_n\,;\ n\ge 0\}. \tag{4.4}$$

$\{N_s;\ s\le t\}$ can be expressed via $\{M_s;\ s\le t\}$ and vice versa. Indeed, (4.2), (4.4) and (2.3) imply

$$N_t = \sum_{k=1}^{t} 1_{\{M_k = 1\}} \quad\text{and}\quad M_t = 1 + \sup\{n\ge 0 : N_{t-n} = N_t\}, \quad t\in\mathbb{N}. \tag{4.5}$$
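The identities (4.3) and (4.5) can be checked on a simulated path. The following is only a minimal sketch: the transition rule (from $(x,m)$ the letter switches with probability $\alpha_{i,m}$ and the memory resets to 1, otherwise the memory grows by one) is an assumption inferred from the sojourn laws used in this section, and the two switching sequences are arbitrary illustrative choices.

```python
import random


def simulate_chain(n, alpha1, alpha2, rng):
    """Simulate (X_k, M_k) for k = 0..n, started at (1, 1).

    Assumed rule: from (x, m) switch letter with probability alpha_i(m)
    (i = 2 if x = 1, i = 1 if x = -1) and reset the memory to 1;
    otherwise keep the letter and increase the memory by one.
    """
    xs, ms = [1], [1]
    for _ in range(n):
        x, m = xs[-1], ms[-1]
        a = alpha2(m) if x == 1 else alpha1(m)
        if rng.random() < a:
            xs.append(-x)
            ms.append(1)
        else:
            xs.append(x)
            ms.append(m + 1)
    return xs, ms


def check_path_identities(xs, ms):
    """Check (4.3) and both identities of (4.5) along one path."""
    n = len(xs) - 1
    # counting process: N_t = number of k in 1..t with M_k = 1
    counts = [0]
    for t in range(1, n + 1):
        counts.append(counts[-1] + (ms[t] == 1))
    s = 0
    for t in range(n + 1):
        s += xs[t]
        # (4.3): S_t is the sum of (-1)^{N_k} over k = 0..t
        assert s == sum((-1) ** counts[k] for k in range(t + 1))
        # (4.5): M_t = 1 + sup{j >= 0 : N_{t-j} = N_t}
        age = max(j for j in range(t + 1) if counts[t - j] == counts[t])
        assert ms[t] == 1 + age
    return True
```

For instance, `check_path_identities(*simulate_chain(200, lambda m: 0.5, lambda m: 1 / (m + 1), random.Random(0)))` runs the checks on one path of length 200.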

4.2 Distribution of the persistent random walk at a fixed time 

In this section we give the explicit distribution of the persistent random walk at any fixed 
time. 

We recall that $(X_n, M_n)_{n\ge 0}$ is a $\{-1,1\}\times\mathbb{N}^*$-valued Markov chain with transition matrix $Q$ defined by (2.1) and starting values $(X_0, M_0) = (1,1)$. Therefore the law of $(X_n, M_n)$ is given by $Q^n$. However the calculation of $Q^n$ is intractable. This leads us to restrict ourselves to the law of $X_n$.
Let us define

$$\mathcal{N}(m,b) := \bigl\{u\in(\mathbb{N}^*)^m : u_1 + \ldots + u_m = b\bigr\}, \quad m\ge 1,\ b\ge 1,$$

and

$$A_i(m,b) = \sum_{u\in\mathcal{N}(m,b)} \mathcal{P}_i(u_1)\cdots\mathcal{P}_i(u_m)\,\alpha_{i,u_1}\times\ldots\times\alpha_{i,u_m}, \quad i = 1,2, \tag{4.6}$$

with $A_i(m,b) = 0$ for $0\le b < m$ and $A_i(0,b) = 1_{\{b=0\}}$.

The distribution of the random walk $(S_n)_{n\ge 1}$ can be directly linked to the occupation measure $L_n(1)$ of the increments $(X_n)_{n\ge 1}$ in the following way:

Proposition 4.1. (distribution of $S_n$) Suppose that $(X_0, M_0) = (1,1)$.

(i) Let us introduce the local time

$$L_n(1) := \sum_{k=1}^{n} 1_{\{X_k = 1\}}, \tag{4.7}$$

then the random walk satisfies for $n\ge 1$,

$$S_n = 1 + 2L_n(1) - n. \tag{4.8}$$

Consequently, for any $0\le k\le n$:

$$\eta_n(k) := \mathbb{P}(S_n = 1 + 2k - n) = \mathbb{P}(L_n(1) = k). \tag{4.9}$$

(ii) Moreover, for $0\le k\le n$, we have

$$\eta_n(k) = \eta_n^{(1)}(k) + \eta_n^{(2)}(k) \tag{4.10}$$

with

$$\eta_n^{(1)}(k) = \sum_{1\le m\le (k+1)\wedge(n-k)} A_2(m,k+1) \sum_{\ell=1}^{n-k-m+1} A_1(m-1,n-k-\ell)\,\mathcal{P}_1(\ell), \tag{4.11}$$

$$\eta_n^{(2)}(k) = \sum_{0\le m\le k\wedge(n-k)} A_1(m,n-k) \sum_{\ell=1}^{k-m+1} A_2(m,k+1-\ell)\,\mathcal{P}_2(\ell). \tag{4.12}$$

Proof.

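(i) Using the definition of $L_n(1)$, we get

$$S_n = 1 + \sum_{i=1}^{n} 1_{\{X_i = 1\}} - \sum_{i=1}^{n} 1_{\{X_i = -1\}} = 1 + \sum_{i=1}^{n} 1_{\{X_i = 1\}} - \Bigl(n - \sum_{i=1}^{n} 1_{\{X_i = 1\}}\Bigr) = 1 + 2L_n(1) - n.$$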

(ii) In order to compute $\eta_n(k)$, it is convenient to use the family of stopping times $(T_n)$ introduced in (2.17). The probability of the event $\{L_n(1) = k\}$ can be decomposed into two parts, according to the fact that time $n$ arrives on the way up or on the way down:

$$\eta_n(k) = \sum_{m\ge 0} \eta_n^{(1)}(k,m) + \sum_{m\ge 1} \eta_n^{(2)}(k,m) \tag{4.13}$$

where

$$\eta_n^{(1)}(k,m) := \mathbb{P}\bigl(L_n(1) = k,\ T_{2m}\le n < T_{2m+1}\bigr), \quad m\ge 0,$$

and

$$\eta_n^{(2)}(k,m) := \mathbb{P}\bigl(L_n(1) = k,\ T_{2m-1}\le n < T_{2m}\bigr), \quad m\ge 1.$$

First step: computation of $\eta_n^{(1)}(k,m)$ for $n\ge k$. Suppose first that $m\ge 1$. On the set $\{L_n(1) = k,\ T_{2m}\le n < T_{2m+1}\}$, we define for $0\le i < m$ the length of the $i$-th ascent $W_i := T_{2i+1} - T_{2i}$, $W_m := n + 1 - T_{2m}$, and the length of the $i$-th descent $V_i := T_{2i} - T_{2i-1}$ for $1\le i\le m$. Then

$$W_0 + W_1 + \cdots + W_m + V_1 + \ldots + V_m = n+1, \qquad W_0 + W_1 + \ldots + W_m = k+1. \tag{4.14}$$

Therefore for $W = (W_0,\ldots,W_m)$ and $V = (V_1,\ldots,V_m)$ we get

$$\eta_n^{(1)}(k,m) = \sum_{w\in\mathcal{N}(m+1,k+1)}\ \sum_{v\in\mathcal{N}(m,n-k)} \mathbb{P}(W = w,\ V = v). \tag{4.15}$$

Using the distributions of $T_{2i+1} - T_{2i}$ and $T_{2i+2} - T_{2i+1}$ given in Proposition 2.6 we obtain, writing $w = (w_1,\ldots,w_{m+1})$,

$$\mathbb{P}(W = w,\ V = v) = \mathcal{P}_2(w_1)\alpha_{2,w_1}\,\mathcal{P}_1(v_1)\alpha_{1,v_1}\times\ldots\times\mathcal{P}_2(w_{m+1}). \tag{4.16}$$

It is clear that (4.15) and (4.16) imply

$$\eta_n^{(1)}(k,m) = \bar A_2(m+1,k+1)\,A_1(m,n-k), \tag{4.17}$$

where $A_1$ is defined by (4.6) and for $m\ge 2$ and $i\in\{1,2\}$,

$$\bar A_i(m,b) = \sum_{u\in\mathcal{N}(m,b)} \mathcal{P}_i(u_1)\alpha_{i,u_1}\cdots\mathcal{P}_i(u_{m-1})\alpha_{i,u_{m-1}}\,\mathcal{P}_i(u_m) \tag{4.18}$$

and $\bar A_i(1,b) = \mathcal{P}_i(b)$.

If $m = 0$, then $n = k$ and $\eta_n^{(1)}(k,m) = \mathcal{P}_2(n+1)$. Therefore (4.17) holds with $m = 0$.



Step 2: computation of $\eta_n^{(2)}(k,m)$. Similarly, define on $\{L_n(1) = k,\ T_{2m-1}\le n < T_{2m}\}$, $W_i := T_{2i+1} - T_{2i}$ for $0\le i < m$, $V_i := T_{2i} - T_{2i-1}$ for $1\le i < m$ and $V_m := n + 1 - T_{2m-1}$; then:

$$W_0 + W_1 + \cdots + W_{m-1} + V_1 + \ldots + V_m = n+1, \qquad W_0 + W_1 + \ldots + W_{m-1} = k+1.$$

For $W = (W_0,\ldots,W_{m-1})$ and $V = (V_1,\ldots,V_m)$ we get

$$\eta_n^{(2)}(k,m) = \sum_{w\in\mathcal{N}(m,k+1)}\ \sum_{v\in\mathcal{N}(m,n-k)} \mathbb{P}(W = w,\ V = v) = A_2(m,k+1)\,\bar A_1(m,n-k). \tag{4.19}$$

Combining (4.13), (4.17) and (4.19) leads to

$$\eta_n(k) = \sum_{m=0}^{k\wedge(n-k)} \bar A_2(m+1,k+1)\,A_1(m,n-k) + \sum_{m=1}^{(k+1)\wedge(n-k)} A_2(m,k+1)\,\bar A_1(m,n-k). \tag{4.20}$$

In order to prove (4.10), it suffices to express $\bar A_i$ in terms of $A_i$. For $b\ge m\ge 1$ we observe that

$$\mathcal{N}(m,b) = \bigl\{(u,u_m) : u\in\mathcal{N}(m-1,j),\ u_m = b - j,\ m-1\le j\le b-1\bigr\}.$$

Hence, for $b\ge m\ge 1$,

$$\bar A_i(m,b) = \sum_{j=m-1}^{b-1} A_i(m-1,j)\,\mathcal{P}_i(b-j) = \sum_{l=1}^{b-m+1} \mathcal{P}_i(l)\,A_i(m-1,b-l). \tag{4.21}$$

Observe that (4.21) is still valid if $m = 1$, since $A_i(0,b) = 1_{\{b=0\}}$ and $\bar A_i(1,b) = \mathcal{P}_i(b)$. The decomposition (4.21) transforms (4.20) into (4.10). ■
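As a numerical sanity check of Proposition 4.1, the sketch below evaluates the right-hand side of (4.10) from (4.6), (4.11) and (4.12) and compares it with the law of $L_n(1)$ obtained by a direct forward recursion on $(X_n, M_n)$. It is an illustration under stated assumptions, not part of the original argument: the transition rule (switch with probability $\alpha_{i,m}$ from memory $m$, with $\mathcal{P}_i(k) = \prod_{j<k}(1-\alpha_{i,j})$) and the helper names are assumptions made for the example.

```python
def make_P(alpha, nmax):
    """P[k] = prod_{j=1}^{k-1} (1 - alpha(j)) for k = 1..nmax+1 (P[0] is unused)."""
    P = [0.0, 1.0]
    for k in range(2, nmax + 2):
        P.append(P[-1] * (1.0 - alpha(k - 1)))
    return P


def make_A(alpha, P, nmax):
    """A[m][b] of (4.6), computed by peeling off the last block u_m."""
    A = [[0.0] * (nmax + 2) for _ in range(nmax + 2)]
    A[0][0] = 1.0  # A(0, b) = 1_{b = 0}
    for m in range(1, nmax + 2):
        for b in range(m, nmax + 2):
            A[m][b] = sum(P[u] * alpha(u) * A[m - 1][b - u]
                          for u in range(1, b - m + 2))
    return A


def eta_formula(n, k, A1, A2, P1, P2):
    """Right-hand side of (4.10), i.e. (4.11) + (4.12)."""
    part1 = sum(A2[m][k + 1]
                * sum(A1[m - 1][n - k - l] * P1[l]
                      for l in range(1, n - k - m + 2))
                for m in range(1, min(k + 1, n - k) + 1))
    part2 = sum(A1[m][n - k]
                * sum(A2[m][k + 1 - l] * P2[l]
                      for l in range(1, k - m + 2))
                for m in range(0, min(k, n - k) + 1))
    return part1 + part2


def eta_exact(n, alpha1, alpha2):
    """Law of L_n(1) by forward recursion on (X, M, L), started at (1, 1, 0)."""
    law = {(1, 1, 0): 1.0}
    for _ in range(n):
        new = {}
        for (x, mem, cnt), pr in law.items():
            a = alpha2(mem) if x == 1 else alpha1(mem)
            stay = (x, mem + 1, cnt + (x == 1))    # same letter, memory grows
            new[stay] = new.get(stay, 0.0) + pr * (1.0 - a)
            switch = (-x, 1, cnt + (x == -1))      # new letter, memory resets
            new[switch] = new.get(switch, 0.0) + pr * a
        law = new
    out = [0.0] * (n + 1)
    for (_, _, cnt), pr in law.items():
        out[cnt] += pr
    return out
```

With two arbitrary switching sequences, the two computations can be compared for every $0 \le k \le n$; the recursion for `A` costs $O(n^3)$ operations, which is negligible for the small $n$ used in such a check.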

Remark 4.2. In the particular situation $\alpha_{2,k} = \alpha_2$ for any $k\ge 1$, which is associated with the simple infinite comb, the distribution of $S_n$ given by (4.9) and (4.10) can be simplified since

$$A_2(m,b) = \binom{b-1}{m-1}(1-\alpha_2)^{b-m}\alpha_2^{m} = \alpha_2\,\bar A_2(m,b), \quad b\ge m\ge 1.$$

Of course by symmetry we also get a similar expression for $A_1$ if $\alpha_{1,k} = \alpha_1$ for any $k\ge 1$. Combining both identities, Proposition 4.1 gives the distribution of $S_n$ when $X_n$ is a Markov chain. Let us just note that the associated VLMC is very particular and the generating function of $S_n$ was already presented in [15].



Corollary 4.3. Suppose that $\alpha_{1,k} = \alpha_1$ and $\alpha_{2,k} = \alpha_2$ for any $k\ge 1$. This means that $(X_n)$ is a $\{-1,1\}$-valued Markov chain with transition matrix

$$\begin{pmatrix} 1-\alpha_1 & \alpha_1 \\ \alpha_2 & 1-\alpha_2 \end{pmatrix}.$$

Then one has

$$\mathbb{P}(L_n(1) = k) = \sum_{m=1}^{(k+1)\wedge(n-k)} \binom{k}{m-1}\binom{n-k-1}{m-1}\alpha_1^{m-1}(1-\alpha_1)^{n-k-m}\alpha_2^{m}(1-\alpha_2)^{k+1-m} + \sum_{m=0}^{k\wedge(n-k)} \binom{k}{m}\binom{n-k-1}{m-1}\alpha_1^{m}(1-\alpha_1)^{n-k-m}\alpha_2^{m}(1-\alpha_2)^{k-m},$$

where the term $m = 0$ of the second sum is read with the convention $\binom{-1}{-1} = 1$ and $\binom{j}{-1} = 0$ for $j\ge 0$, in agreement with $A_1(0,b) = 1_{\{b=0\}}$.

Remark 4.4. (i) Note that we have actually proved a more complete result than (4.11) and (4.12):

$$\mathbb{P}\bigl(L_n(1) = k,\ T_{2m}\le n < T_{2m+1}\bigr) = \bar A_2(m+1,k+1)\,A_1(m,n-k), \tag{4.22}$$

for $0\le m\le k\wedge(n-k)$ and

$$\mathbb{P}\bigl(L_n(1) = k,\ T_{2m-1}\le n < T_{2m}\bigr) = A_2(m,k+1)\,\bar A_1(m,n-k), \tag{4.23}$$

for $1\le m\le (k+1)\wedge(n-k)$, where $\bar A_1$ and $\bar A_2$ are defined by (4.18).
(ii) We deduce from (4.22) that

$$\mathbb{P}(T_{2m}\le n < T_{2m+1}) = \sum_{k=m}^{n-m} \bar A_2(m+1,k+1)\,A_1(m,n-k).$$

Since the left hand side equals $\mathbb{P}(T_{2m}\le n,\ T_{2m+1} - T_{2m} > n - T_{2m})$, Proposition 2.6 and Remark 2.7 imply

$$\mathbb{E}\bigl[1_{\{T_{2m}\le n\}}\,\mathcal{P}_2(n - T_{2m} + 1)\bigr] = \sum_{k=m}^{n-m} \bar A_2(m+1,k+1)\,A_1(m,n-k).$$

Recall that $T_{2m}\ge 2m$. Then taking successively $n = 2m$, $n = 2m+1$ and so on, we are theoretically able to determine the law of $T_{2m}$.
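The closed form of Corollary 4.3 can be checked against a direct computation for the two-state Markov chain. The sketch below is illustrative only; it reads the formula as the specialization of (4.20) to constant switching probabilities, and the helper `binom`, which encodes the boundary convention $\binom{-1}{-1} = 1$, is an assumption introduced for the example.

```python
from math import comb


def binom(n, k):
    """Binomial coefficient with binom(-1, -1) = 1 and 0 outside 0 <= k <= n."""
    if n == -1 and k == -1:
        return 1
    if k < 0 or n < 0 or k > n:
        return 0
    return comb(n, k)


def law_closed_form(n, k, a1, a2):
    """P(L_n(1) = k) for constant alpha_1 = a1, alpha_2 = a2, started at X_0 = 1."""
    # time n falls in a descent: m completed ascents, m descents (last one open)
    down = sum(binom(k, m - 1) * binom(n - k - 1, m - 1)
               * a1 ** (m - 1) * (1 - a1) ** (n - k - m)
               * a2 ** m * (1 - a2) ** (k + 1 - m)
               for m in range(1, min(k + 1, n - k) + 1))
    # time n falls in an ascent: m completed descents, m + 1 ascents (last one open)
    up = sum(binom(k, m) * binom(n - k - 1, m - 1)
             * a1 ** m * (1 - a1) ** (n - k - m)
             * a2 ** m * (1 - a2) ** (k - m)
             for m in range(0, min(k, n - k) + 1))
    return down + up


def law_direct(n, a1, a2):
    """Same law by forward recursion on (X_t, number of +1 among X_1..X_t)."""
    law = {(1, 0): 1.0}
    for _ in range(n):
        new = {}
        for (x, c), p in law.items():
            switch = a2 if x == 1 else a1
            for y, q in ((x, 1 - switch), (-x, switch)):
                key = (y, c + (y == 1))
                new[key] = new.get(key, 0.0) + p * q
        law = new
    out = [0.0] * (n + 1)
    for (_, c), p in law.items():
        out[c] += p
    return out
```

For example, comparing `law_closed_form(7, k, 0.3, 0.6)` with `law_direct(7, 0.3, 0.6)[k]` for $k = 0,\ldots,7$ exercises both sums, including the boundary term $m = 0$.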

As said in Remark 4.4, Proposition 4.1 contains in a hidden way the distribution of $T_{2m}$ and $T_{2m+1}$. However it is actually possible to determine the distribution of these two random variables differently. It is convenient to introduce the notations:

$$\Delta g(n) = g(n) - g(n+1), \quad n\ge 0,$$

$$\varphi * \psi(n) = \sum_{k=0}^{n} \varphi(k)\,\psi(n-k), \quad n\ge 0,$$

$$\theta : \mathbb{N}\to\mathbb{N}, \quad \theta(n) = n+1,$$

where $g, \varphi, \psi : \mathbb{N}\to\mathbb{R}$.

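Proposition 4.5. Let $\xi_1,\ldots,\xi_k$ be $k$ independent $\mathbb{N}$-valued random variables. Denote for any $n\ge 0$,

$$f_i(n) := \mathbb{P}(\xi_i\ge n).$$

We introduce $\Lambda_r^k$, the set of all subsets of $\{1,\ldots,k\}$ containing $r$ elements. Then

$$\mathbb{P}(\xi_1 + \cdots + \xi_k\ge n) = h_k(n)$$

where

$$h_k = \sum_{r=1}^{k}\ \sum_{A\in\Lambda_r^k} \bigl(\Delta^{r-1} f^*_A\bigr)\circ\theta^{\,k-r}$$

and $f^*_A = f_{i_1} * \ldots * f_{i_r}$ when $A = \{i_1,\ldots,i_r\}$.

We do not prove Proposition 4.5 since it does not play a main role in our study.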

Remark 4.6. 1. If $\xi$ is geometrically distributed with parameter $1-p$ (i.e. $\mathbb{P}(\xi = n) = (1-p)p^n$, $n\ge 0$, $p\in\,]0,1[$) then the function $f$ associated with $\xi$ is $f(n) = p^n$, $n\ge 0$.

2. Suppose that $\xi_1 = T_{2m} - T_{2m-1}$ (resp. $\xi_2 = T_{2m-1} - T_{2m-2}$) where $m\ge 1$; then Remark 2.7 implies that

$$\mathbb{P}(\xi_i - 1\ge n) = \mathcal{P}_i(n+1), \quad i = 1,2,\ n\ge 0,$$

where $\mathcal{P}_i$ has been defined previously.

Definition 4.7. Let $\rho\in\,]0,1[$. An $\mathbb{N}$-valued random variable $\xi_\rho$ is said to be pseudo-Poisson distributed with parameter $\rho > 0$ when for all $n\ge 0$:

$$f_\rho(n) = \mathbb{P}(\xi_\rho\ge n) = \frac{\rho^n}{n!}.$$

It is clear that if $\alpha_{i,k} = 1 - \rho_i/k$ where $\rho_i\in\,]0,1[$, then $\mathcal{P}_i(n) = \rho_i^{\,n-1}/(n-1)!$. Therefore $\xi_i - 1$ (cf. item 2 of Remark 4.6) is pseudo-Poisson with parameter $\rho_i$.
It is immediate to prove, by the binomial formula, that:

$$f_\rho * f_{\rho'} = f_{\rho+\rho'}.$$

Reasoning by induction on $k$ and using Proposition 4.5 we get the following result.

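Proposition 4.8. Suppose that $\xi_1,\ldots,\xi_k$ are independent, and $\xi_i$ is pseudo-Poisson with parameter $\rho_i$. Then:

$$\mathbb{P}(\xi_1 + \cdots + \xi_k\ge n) = h_k(n), \quad n\ge 0,$$

where

$$h_k(n) = \sum_{r=1}^{k}\sum_{j=0}^{r-1}\sum_{A\in\Lambda_r^k} (-1)^j\binom{r-1}{j}\frac{\bigl(\sum_{i\in A}\rho_i\bigr)^{n+k+j-r}}{(n+k+j-r)!}. \tag{4.24}$$

In the particular case $\rho_1 = \ldots = \rho_k = \rho$,

$$h_k(n) = \sum_{r=1}^{k}\binom{k}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\frac{(r\rho)^{n+k+j-r}}{(n+k+j-r)!}.$$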
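The tail formula for a sum of independent pseudo-Poisson variables can be checked numerically. The sketch below compares the triple sum over $r$, $j$ and the subsets $A$ (with the alternating signs produced by the iterated difference operator $\Delta^{r-1}$) with a direct convolution of the individual laws; the function names and the parameter values are illustrative assumptions.

```python
from itertools import combinations
from math import comb, factorial


def tail_pseudo_poisson(rho, n):
    """f_rho(n) = P(xi_rho >= n) = rho^n / n!."""
    return rho ** n / factorial(n)


def tail_of_sum_direct(rhos, n):
    """P(xi_1 + ... + xi_k >= n), obtained by convolving the individual laws."""
    law = [1.0]  # law of the empty sum, truncated to values <= n
    for rho in rhos:
        pmf = [tail_pseudo_poisson(rho, j) - tail_pseudo_poisson(rho, j + 1)
               for j in range(n + 1)]
        new = [0.0] * (n + 1)
        for a, pa in enumerate(law):
            for b, pb in enumerate(pmf):
                if a + b <= n:
                    new[a + b] += pa * pb
        law = new
    return 1.0 - sum(law[:n])


def tail_of_sum_formula(rhos, n):
    """Sum over r, subsets A of size r, and j of
    (-1)^j C(r-1, j) (sum_{i in A} rho_i)^e / e!, with e = n + k + j - r."""
    k = len(rhos)
    total = 0.0
    for r in range(1, k + 1):
        for subset in combinations(rhos, r):
            s = sum(subset)
            for j in range(r):
                e = n + k + j - r
                total += (-1) ** j * comb(r - 1, j) * s ** e / factorial(e)
    return total
```

For $k = 2$, $\rho_1 = \rho_2 = \rho$ and $n = 1$, both functions return $2\rho - \rho^2 = 1 - (1-\rho)^2$, the probability that at least one of the two variables is non-zero.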



Remark 4.9. Suppose that = 1 — %, k >1. Then 

IP (X^ 2 * " r »-i ~ 1) > "J = « > 0, 

where h k is given by (|4.24H . 



4.3 Distribution of the persistent random walk at an independent time 

As Proposition 4.1 shows, the law of $S_n$ is rather complicated. In the study of a Markov chain, it can be interesting to stop it at a random time. For instance, a Markov chain stopped at a geometric time independent from the Markov chain remains a Markov chain.
Let us consider a geometric random variable $\tau + 1$ with parameter $\rho\in\,]0,1[$ and independent from $(X_n, M_n)$:

$$\mathbb{P}(\tau = k) = \rho^k(1-\rho), \quad k\ge 0. \tag{4.25}$$

In this section we first determine in Theorem 4.10 below the generating function $\Phi(\lambda,\rho)$ of $S_\tau$:

$$\Phi(\lambda,\rho) := \mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = (1-\rho)\sum_{k\ge 0}\rho^k\,\mathbb{E}\bigl[\lambda^{S_k}\bigr], \quad 0 < \rho < 1. \tag{4.26}$$

This would allow us to deduce the generating function of $S_k$ for any $k$ since:

$$\mathbb{E}\bigl[\lambda^{S_k}\bigr] = \frac{1}{k!}\,\frac{\partial^k}{\partial\rho^k}\left(\frac{\Phi(\lambda,\rho)}{1-\rho}\right)\Bigg|_{\rho=0}. \tag{4.27}$$

Since we have already calculated the law of $S_k$, we do not go further in this direction.
In Section 5 we will prove that under certain conditions, the persistent random walk $(S_n)$ converges to a Markov process $(S^0(t))_{t\in\mathbb{R}_+}$. The following Theorem 4.10 will be used to calculate the Laplace transform of $S^0(\xi)$, where $\xi$ is an exponential random variable independent of $(S^0(t))_{t\in\mathbb{R}_+}$. Theoretically, the following theorem makes it possible to deduce the law of $S_\tau$, but it is in practice impossible to determine it explicitly. However, using the law of $L_n(1)$ for any $n$, given in Proposition 4.1, we present in Proposition 4.13 below the distribution of $L_\tau(1)$. Recall that from (4.8), $S_\tau = 1 + 2L_\tau(1) - \tau$. Since $\tau$ is a random time, we cannot deduce from this identity the distribution of $S_\tau$.

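Theorem 4.10. Let $0 < \rho < \lambda < 1$. Then the generating function of $S_\tau$, where $(S_n)_{n\ge 0}$ and $\tau$ are independent, is equal to

$$\mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = \frac{(\rho-1)\Bigl\{\lambda\rho\bigl(\mathcal{P}_1(\tfrac{\rho}{\lambda}) + \mathcal{P}_2(\lambda\rho)\bigr) + (\lambda\rho-1)\,\mathcal{P}_1(\tfrac{\rho}{\lambda})\,\mathcal{P}_2(\lambda\rho)\Bigr\}}{\rho(\lambda\rho-1)\,\mathcal{P}_2(\lambda\rho) + \lambda\rho(\rho-\lambda)\,\mathcal{P}_1(\tfrac{\rho}{\lambda}) + (\lambda\rho-1)(\rho-\lambda)\,\mathcal{P}_1(\tfrac{\rho}{\lambda})\,\mathcal{P}_2(\lambda\rho)} \tag{4.28}$$

where $\mathcal{P}_i$ is defined for $0 < x < 1$ by

$$\mathcal{P}_i(x) = \sum_{k\ge 1}\mathcal{P}_i(k)\,x^k, \quad i = 1,2. \tag{4.29}$$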

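Remark 4.11. If $\alpha_{2,k} = \alpha_2$ for any $k\ge 1$ (recall that in that case $S_n$ is the persistent random walk associated with the simple infinite comb), the function $\mathcal{P}_2$ satisfies

$$\mathcal{P}_2(x) = \sum_{k\ge 1}(1-\alpha_2)^{k-1}x^k = \frac{x}{1 - (1-\alpha_2)x}.$$

Therefore (4.28) becomes

$$\mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = \frac{\lambda(\rho-1)\bigl(1 + \alpha_2\,\mathcal{P}_1(\tfrac{\rho}{\lambda})\bigr)}{\lambda\rho - 1 + \alpha_2\lambda(\rho-\lambda)\,\mathcal{P}_1(\tfrac{\rho}{\lambda})}.$$

Moreover, if $\alpha_{1,k} = 1 - \alpha_1/k$, then

$$\mathcal{P}_1(x) = \sum_{k\ge 1}\frac{\alpha_1^{k-1}x^k}{(k-1)!} = x\,e^{\alpha_1 x}$$

and

$$\mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = \frac{(\rho-1)\bigl(\lambda + \alpha_2\rho\,e^{\alpha_1\rho/\lambda}\bigr)}{\lambda\rho - 1 + \alpha_2\rho(\rho-\lambda)\,e^{\alpha_1\rho/\lambda}}.$$

Both expressions equal 1 at $\lambda = 1$, as they must.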
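The generating function (4.28) can be compared with a truncated version of the series (4.26). The sketch below is an illustration under stated assumptions: the series is summed by a forward recursion on $(X_n, M_n)$ with the transition rule "switch with probability $\alpha_{i,m}$ from memory $m$", and the example takes $\alpha_{2,k} = 0.4$ and $\alpha_{1,k} = 1 - 0.5/k$, so that $\mathcal{P}_2(x) = x/(1-0.6x)$ and $\mathcal{P}_1(x) = x e^{0.5x}$. These numerical values are arbitrary.

```python
import math


def gf_series(lam, rho, alpha1, alpha2, nmax=200):
    """(1 - rho) * sum_{n <= nmax} rho^n E[lam^{S_n}], started at (X_0, M_0) = (1, 1)."""
    # w[(x, mem)] = rho^n * E[lam^{S_n}; X_n = x, M_n = mem], with S_0 = 1
    w = {(1, 1): lam}
    total = lam
    for _ in range(nmax):
        new = {}
        for (x, mem), val in w.items():
            a = alpha2(mem) if x == 1 else alpha1(mem)
            stay = (x, mem + 1)   # next increment is x
            new[stay] = new.get(stay, 0.0) + rho * val * (1.0 - a) * lam ** x
            switch = (-x, 1)      # next increment is -x
            new[switch] = new.get(switch, 0.0) + rho * val * a * lam ** (-x)
        w = new
        total += sum(w.values())
    return (1.0 - rho) * total


def gf_formula(lam, rho, P1, P2):
    """Right-hand side of (4.28), with P1, P2 the power series (4.29)."""
    a, b = P1(rho / lam), P2(lam * rho)
    num = (rho - 1) * (lam * rho * (a + b) + (lam * rho - 1) * a * b)
    den = (rho * (lam * rho - 1) * b
           + lam * rho * (rho - lam) * a
           + (lam * rho - 1) * (rho - lam) * a * b)
    return num / den


def example(lam=0.7, rho=0.3):
    """Return (truncated series, closed form) for the illustrative parameters."""
    series = gf_series(lam, rho,
                       alpha1=lambda k: 1.0 - 0.5 / k,
                       alpha2=lambda k: 0.4)
    closed = gf_formula(lam, rho,
                        P1=lambda x: x * math.exp(0.5 * x),
                        P2=lambda x: x / (1.0 - 0.6 * x))
    return series, closed
```

Since $|S_n| \le n+1$, the $n$-th term of the series is bounded by a constant times $(\rho/\lambda)^n$, so the truncation error at `nmax = 200` is negligible for $\rho/\lambda = 3/7$.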



We begin with a preliminary result (Lemma 4.12). The proof of Theorem 4.10 will be given later on. For $i\in\{1,2\}$ and $0 < x < 1$, let us define the generating function

$$G^{(i)}(x) := \sum_{k\ge 1}\mathcal{P}_i(k)\,\alpha_{i,k}\,x^k.$$

Lemma 4.12. (i) For $i = 1,2$ and $0 < x < 1$, the generating function $G^{(i)}$ satisfies

$$G^{(i)}(x) = 1 + \Bigl(\frac{x-1}{x}\Bigr)\mathcal{P}_i(x), \tag{4.30}$$

where $\mathcal{P}_i(x)$ has been defined by (4.29).

(ii) Moreover for $m\ge 1$,

$$\sum_{b\ge m} A_i(m,b)\,x^b = \bigl(G^{(i)}(x)\bigr)^m, \qquad \sum_{b\ge m}\bar A_i(m,b)\,x^b = \bigl(G^{(i)}(x)\bigr)^{m-1}\mathcal{P}_i(x), \tag{4.31}$$

where $A_i$ (resp. $\bar A_i$) is defined by (4.6) (resp. (4.18)).
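Proof of Lemma 4.12.
(i) Let $0 < x < 1$. We have

$$G^{(i)}(x) = -\sum_{k\ge 1}\mathcal{P}_i(k)(1-\alpha_{i,k})\,x^k + \sum_{k\ge 1}\mathcal{P}_i(k)\,x^k = -\frac{1}{x}\sum_{k\ge 1}\mathcal{P}_i(k+1)\,x^{k+1} + \mathcal{P}_i(x) = -\frac{1}{x}\bigl(\mathcal{P}_i(x) - x\bigr) + \mathcal{P}_i(x).$$

(ii) For $m\ge 1$,

$$\sum_{b\ge m} A_i(m,b)\,x^b = \sum_{b\ge m}\ \sum_{u\in\mathcal{N}(m,b)}\mathcal{P}_i(u_1)\cdots\mathcal{P}_i(u_m)\,\alpha_{i,u_1}\cdots\alpha_{i,u_m}\,x^b = \sum_{u\in(\mathbb{N}^*)^m}\bigl(\mathcal{P}_i(u_1)\alpha_{i,u_1}x^{u_1}\bigr)\cdots\bigl(\mathcal{P}_i(u_m)\alpha_{i,u_m}x^{u_m}\bigr) = \bigl(G^{(i)}(x)\bigr)^m.$$

The proof of the second equality in (4.31) is similar to the first one. ■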

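PROOF of Theorem 4.10. Let $0 < \rho < \lambda < 1$. Using (4.8) together with the independence between $S_n$ and $\tau$ yields

$$\mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = (1-\rho)\sum_{n\ge 0}\mathbb{E}\bigl[\lambda^{S_n}\bigr]\rho^n = \lambda(1-\rho)\sum_{n\ge 0}\mathbb{E}\bigl[\lambda^{2L_n(1)}\bigr]\Bigl(\frac{\rho}{\lambda}\Bigr)^n = \lambda(1-\rho)\sum_{k\ge 0}\lambda^{2k}\sum_{n\ge k}\eta_n(k)\Bigl(\frac{\rho}{\lambda}\Bigr)^n. \tag{4.32}$$

See Proposition 4.1 for the definition of $\eta_n(k)$. Using the decomposition (4.13) and the equalities (4.17) and (4.19) leads to the following decomposition

$$\mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = \lambda(1-\rho)(\mathcal{E}_1 + \mathcal{E}_2), \tag{4.33}$$

where $\mathcal{E}_i$ corresponds to the part related to $\eta_n^{(i)}(k,m)$ (cf. (4.13)), i.e.:

$$\mathcal{E}_1 = \sum_{k\ge 0}\lambda^{2k}\sum_{n\ge k}\ \sum_{m=0}^{k\wedge(n-k)}\eta_n^{(1)}(k,m)\Bigl(\frac{\rho}{\lambda}\Bigr)^n = \sum_{k\ge 0}(\lambda\rho)^k\sum_{m=0}^{k}\bar A_2(m+1,k+1)\sum_{n\ge m+k} A_1(m,n-k)\Bigl(\frac{\rho}{\lambda}\Bigr)^{n-k}.$$

By (4.31), we get

$$\mathcal{E}_1 = \sum_{k\ge 0}(\lambda\rho)^k\sum_{m=0}^{k}\bar A_2(m+1,k+1)\Bigl(G^{(1)}\bigl(\tfrac{\rho}{\lambda}\bigr)\Bigr)^m = \frac{1}{\lambda\rho}\sum_{m\ge 0}\Bigl(G^{(1)}\bigl(\tfrac{\rho}{\lambda}\bigr)\Bigr)^m\sum_{k\ge m}\bar A_2(m+1,k+1)(\lambda\rho)^{k+1} = \frac{1}{\lambda\rho}\,\frac{\mathcal{P}_2(\lambda\rho)}{1 - G^{(1)}(\tfrac{\rho}{\lambda})\,G^{(2)}(\lambda\rho)}.$$

In a similar way, we compute $\mathcal{E}_2$:

$$\mathcal{E}_2 = \sum_{k\ge 0}\lambda^{2k}\sum_{n\ge k}\ \sum_{m=1}^{(k+1)\wedge(n-k)}\eta_n^{(2)}(k,m)\Bigl(\frac{\rho}{\lambda}\Bigr)^n = \sum_{k\ge 0}(\lambda\rho)^k\sum_{m=1}^{k+1} A_2(m,k+1)\sum_{n\ge m+k}\bar A_1(m,n-k)\Bigl(\frac{\rho}{\lambda}\Bigr)^{n-k}$$

$$= \frac{\mathcal{P}_1(\tfrac{\rho}{\lambda})}{\lambda\rho}\sum_{m\ge 1}\Bigl(G^{(1)}\bigl(\tfrac{\rho}{\lambda}\bigr)\Bigr)^{m-1}\sum_{k\ge m-1} A_2(m,k+1)(\lambda\rho)^{k+1} = \frac{1}{\lambda\rho}\,\frac{\mathcal{P}_1(\tfrac{\rho}{\lambda})\,G^{(2)}(\lambda\rho)}{1 - G^{(1)}(\tfrac{\rho}{\lambda})\,G^{(2)}(\lambda\rho)}.$$

Now (4.33) yields

$$\mathbb{E}\bigl[\lambda^{S_\tau}\bigr] = \frac{1-\rho}{\rho}\cdot\frac{\mathcal{P}_2(\lambda\rho) + \mathcal{P}_1(\tfrac{\rho}{\lambda})\,G^{(2)}(\lambda\rho)}{1 - G^{(1)}(\tfrac{\rho}{\lambda})\,G^{(2)}(\lambda\rho)},$$

which, combined with (4.30), implies (4.28). ■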

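Proposition 4.13. Let $k\ge 0$. The random variable $L_\tau(1)$ satisfies

$$\mathbb{P}(L_\tau(1) = k) = (1-\rho)\rho^k\Bigl\{g_1(\rho)\sum_{m=1}^{k+1} A_2(m,k+1)\,f_1(\rho)^{m-1} + \sum_{m=0}^{k}\sum_{\ell=1}^{k-m+1} A_2(m,k+1-\ell)\,\mathcal{P}_2(\ell)\,f_1(\rho)^{m}\Bigr\} \tag{4.34}$$

with

$$f_i(\rho) = \sum_{k\ge 1}\mathcal{P}_i(k)\,\alpha_{i,k}\,\rho^k \quad\text{and}\quad g_i(\rho) = \sum_{k\ge 1}\mathcal{P}_i(k)\,\rho^k, \quad i = 1,2. \tag{4.35}$$

Moreover

$$f_i(\rho) = \Bigl(1 - \frac{1}{\rho}\Bigr)g_i(\rho) + 1. \tag{4.36}$$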

PROOF. Let us first recall (cf. Proposition 4.1) that

$$\eta_n(k) := \mathbb{P}(L_n(1) = k) = \eta_n^{(1)}(k) + \eta_n^{(2)}(k),$$

where $\eta_n^{(1)}$ resp. $\eta_n^{(2)}$ is defined by (4.11) resp. (4.12). In a similar way, we decompose the following probability

$$\eta(k) := \mathbb{P}(L_\tau(1) = k) = \eta^{(1)}(k) + \eta^{(2)}(k) \tag{4.37}$$

where

$$\eta^{(i)}(k) = (1-\rho)\sum_{n\ge 0}\rho^n\,\eta_n^{(i)}(k), \quad i = 1,2.$$

We shall only present the details of calculation for $\eta^{(1)}(k)$ ($\eta^{(2)}(k)$ can be determined similarly). By definition

$$\eta^{(1)}(k) = (1-\rho)\sum A_2(m,k+1)\,A_1(m-1,n-k-\ell)\,\mathcal{P}_1(\ell)\,\rho^n, \tag{4.38}$$

where the sum is taken over all combinations of indexes $n$, $m$ and $\ell$ satisfying

$$n\ge 0,\quad m\ge 1,\quad m\le k+1,\quad m\le n-k,\quad \ell\ge 1,\quad \ell\le n-k-m+1.$$

Let us first fix the indexes $m$ and $\ell$ with

$$1\le m,\quad m\le k+1,\quad \ell\ge 1. \tag{4.39}$$

Then we compute the sum with respect to $n$. We therefore introduce

$$\psi_{m,\ell}(k) := \sum_{n\ge \ell+k+m-1} A_1(m-1,n-k-\ell)\,\rho^n.$$

By the change of variable $i = n-k-\ell-m+1$, we get

$$\psi_{m,\ell}(k) = \sum_{i\ge 0} A_1(m-1,m-1+i)\,\rho^{k+\ell+m-1+i}$$

$$= \rho^{k+\ell}\sum_{i\ge 0}\ \sum_{u_1,\ldots,u_{m-1}\ge 1}\mathcal{P}_1(u_1)\times\ldots\times\mathcal{P}_1(u_{m-1})\,\alpha_{1,u_1}\times\ldots\times\alpha_{1,u_{m-1}}\,\rho^{u_1+\ldots+u_{m-1}}\,1_{\{u_1+\ldots+u_{m-1} = m-1+i\}}$$

$$= \rho^{k+\ell}\sum_{u_1,\ldots,u_{m-1}\ge 1}\mathcal{P}_1(u_1)\times\ldots\times\mathcal{P}_1(u_{m-1})\,\alpha_{1,u_1}\times\ldots\times\alpha_{1,u_{m-1}}\,\rho^{u_1+\ldots+u_{m-1}} = \rho^{k+\ell}\bigl(f_1(\rho)\bigr)^{m-1},$$

where $f_1$ is defined by (4.35). Let us just note that, in the particular case $m = 1$, we get $A_1(0,n-k-\ell) = 1_{\{n-k-\ell=0\}}$ and $\psi_{1,\ell}(k) = \rho^{k+\ell}$. Using (4.38) we obtain the following sum over all indexes $m$ and $\ell$ satisfying (4.39):

$$\eta^{(1)}(k) = (1-\rho)\sum A_2(m,k+1)\,\mathcal{P}_1(\ell)\,\rho^{k+\ell}\bigl(f_1(\rho)\bigr)^{m-1}.$$

Then

$$\eta^{(1)}(k) = (1-\rho)\rho^k g_1(\rho)\Bigl(\sum_{m=1}^{k+1} A_2(m,k+1)\,f_1(\rho)^{m-1}\Bigr) \tag{4.40}$$

where $g_1$ is defined by (4.35).
It can be proved similarly that

$$\eta^{(2)}(k) = (1-\rho)\rho^k\sum_{m=0}^{k}\sum_{\ell=1}^{k-m+1} A_2(m,k+1-\ell)\,\mathcal{P}_2(\ell)\,f_1(\rho)^{m}. \tag{4.41}$$

Obviously (4.37), (4.40) and (4.41) imply (4.34). Let us finally prove (4.36):

$$f_i(\rho) = \sum_{k\ge 1}\mathcal{P}_i(k)\,\alpha_{i,k}\,\rho^k = \sum_{k\ge 1}\mathcal{P}_i(k)\bigl(1 - (1-\alpha_{i,k})\bigr)\rho^k = g_i(\rho) - \sum_{k\ge 1}\mathcal{P}_i(k+1)\,\rho^k$$

$$= g_i(\rho) - \frac{1}{\rho}\sum_{k\ge 2}\mathcal{P}_i(k)\,\rho^k = g_i(\rho) - \frac{1}{\rho}\bigl(g_i(\rho) - \rho\bigr) = \Bigl(1 - \frac{1}{\rho}\Bigr)g_i(\rho) + 1.$$

4.4 Large time behavior

The law of $S_n$ has been given explicitly in Proposition 4.1 but it is very complicated. This leads us to investigate the asymptotic behaviour of $S_n$ as $n\to\infty$.

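Proposition 4.14. Assume that $\Theta_i < \infty$, $i = 1,2$, where $\Theta_i$ is defined by (2.7).

(i) The ratio $\frac{S_n}{n}$ converges a.s. and in $L^1$ to $\frac{\Theta_2-\Theta_1}{\Theta_1+\Theta_2}$ as $n\to\infty$.

(ii) Moreover, if $\sum_{k\ge 1} k\,\mathcal{P}_i(k) < \infty$ for $i = 1,2$, then the Central Limit Theorem holds:

$$\frac{1}{\sqrt{\Gamma n}}\Bigl(S_n - n\,\frac{\Theta_2-\Theta_1}{\Theta_1+\Theta_2}\Bigr) \tag{4.42}$$

converges in distribution to a standard Gaussian random variable as $n\to\infty$ and the constant $\Gamma$ is defined by

$$\Gamma = \frac{1}{\Theta_1+\Theta_2}\,\mathbb{E}\Bigl[\Bigl(2T_1 - T_2 - T_2\,\frac{\Theta_2-\Theta_1}{\Theta_1+\Theta_2}\Bigr)^2\Bigr], \tag{4.43}$$

where the stopping times $T_1$ and $T_2$ are defined by (2.17) and $X_0 = M_0 = 1$.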

Remark 4.15. 1. Let us first note that, under the condition presented in (ii), we can also prove the existence of a constant $C\in\mathbb{R}$ such that

$$\lim_{n\to\infty}\Bigl(\mathbb{E}(S_n) - n\,\frac{\Theta_2-\Theta_1}{\Theta_1+\Theta_2}\Bigr) = C. \tag{4.44}$$

2. In the particular case $\Theta_1 = \Theta_2 < \infty$, Proposition 4.14 implies that $\lim_{n\to\infty}\mathbb{E}(S_n)/n = 0$. If moreover $\sum_{k\ge 1} k\,\mathcal{P}_i(k) < \infty$, we have a more precise result which says that $\frac{1}{\sqrt{n}}S_n$ converges in distribution to a Gaussian random variable.

3. Under the conditions $\Theta_i < \infty$ and $\sum_{k\ge 1} k\,\mathcal{P}_i(k) < \infty$, we observe therefore that the rates of convergence for the first and the second order limit theorems are similar to the rates in the setting of the classical Bernoulli random walk. The persistency does not change the long time behaviour.

4. The assumption $\sum_{k\ge 1} k\,\mathcal{P}_i(k) < \infty$ is quite strong and forces a relatively strong mixing in the sequence $(X_n)$. Open and interesting questions occur when this assumption is not satisfied. In terms of VLMC, it corresponds to the case when the expectation of the length of $\mathrm{pref}(U_n)$ is infinite.

Proof of Proposition 4.14.

(i) Proposition 2.3 ensures that, under the condition $\Theta_i < \infty$ for $i\in\{1,2\}$, the process $(X_n, M_n)_{n\ge 0}$ is an ergodic Markov chain with invariant probability $\nu$. The ergodic theorem, Corollary 3.2 and (2.7) imply the following almost sure convergence result:

$$\lim_{n\to\infty}\frac{L_n(1)}{n} = \nu(1,\mathbb{N}^*) = \frac{\Theta_2}{\Theta_1+\Theta_2} \quad\text{a.s.}, \tag{4.45}$$

where $L_n(1)$ is defined by (4.7). Since $L_n(1)/n$ is a bounded random variable, the almost sure convergence implies the moment convergence. Therefore, by (4.8) and (4.45), we obtain

$$\lim_{n\to\infty}\frac{\mathbb{E}(S_n)}{n} = \lim_{n\to\infty}\frac{1 + 2\,\mathbb{E}(L_n(1))}{n} - 1 = \frac{2\Theta_2}{\Theta_1+\Theta_2} - 1 = \frac{\Theta_2-\Theta_1}{\Theta_1+\Theta_2}.$$

(ii) Let us consider the Markov chain $(X_n, M_n)_{n\ge 0}$ starting at $(1,1)$ and denote by $Q$ the associated transition probability and by $\nu$ the invariant measure. We define

$$\sigma = \inf\{n\ge 1 : (X_n, M_n) = (1,1)\}. \tag{4.46}$$

Since the Markov chain is recurrent, irreducible and positive, the stopping time $\sigma$ is almost surely finite. Moreover if $\mathbb{E}[\sigma^2] < \infty$, Theorem 17.2.2 in [12] implies that (4.42) holds with the constant

$$\Gamma := \nu(1,1)\,\mathbb{E}\Bigl[\Bigl(\sum_{k=1}^{\sigma}\Bigl(X_k - \frac{\Theta_2-\Theta_1}{\Theta_1+\Theta_2}\Bigr)\Bigr)^2\Bigr]. \tag{4.47}$$

According to Definition (2.17) of the stopping times $(T_n)$, one has $\sigma = T_2$ and consequently

$$\sum_{k=1}^{\sigma} X_k = \sum_{k=1}^{T_1-1} X_k + \sum_{k=T_1}^{T_2-1} X_k + X_{T_2} = T_1 - 1 - (T_2 - T_1) + 1 = 2T_1 - T_2.$$

From (3.6) and (4.47), we deduce (4.43). It remains to prove that $\sigma$ is square integrable. Since $\sigma = T_1 + (T_2 - T_1)$ and $T_2 - T_1\ge 0$, $\mathbb{E}(\sigma^2) < \infty$ if and only if $\mathbb{E}[T_1^2] < \infty$ and $\mathbb{E}[(T_2-T_1)^2] < \infty$. Using Proposition 2.6 we have:

$$\mathbb{E}[T_1^2] = \sum_{n\ge 1} n^2\,\mathcal{P}_2(n)\,\alpha_{2,n} = -\lim_{N\to\infty}\sum_{n=1}^{N} n^2\,\mathcal{P}_2(n)\bigl((1-\alpha_{2,n}) - 1\bigr) = -\lim_{N\to\infty}\Bigl(\sum_{n=1}^{N} n^2\,\mathcal{P}_2(n+1) - \sum_{n=1}^{N} n^2\,\mathcal{P}_2(n)\Bigr)$$

$$\le \mathcal{P}_2(1) + \lim_{N\to\infty}\sum_{n=2}^{N}\bigl(n^2 - (n-1)^2\bigr)\mathcal{P}_2(n) \le 1 + \lim_{N\to\infty}\sum_{n=2}^{N}(2n-1)\,\mathcal{P}_2(n) \le 1 + 2\sum_{n\ge 1} n\,\mathcal{P}_2(n) < \infty.$$

Using (2.19) and similar arguments, we obtain that $\mathbb{E}[(T_2-T_1)^2] < \infty$.
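The limit in (4.45) can be illustrated numerically. The sketch below computes $\mathbb{E}[L_n(1)]/n$ exactly by a forward recursion on $(X_n, M_n)$ and compares it with $\Theta_2/(\Theta_1+\Theta_2)$. The switching sequences are illustrative assumptions: $\alpha_{1,k} = 0.5$ gives $\Theta_1 = 2$ and $\alpha_{2,k} = 1 - 0.8/k$ gives $\Theta_2 = e^{0.8}$; the transition rule is the one assumed throughout (switch with probability $\alpha_{i,m}$ from memory $m$).

```python
import math


def mean_fraction_up(n, alpha1, alpha2):
    """E[L_n(1)] / n, where L_n(1) = #{1 <= k <= n : X_k = 1}, started at (1, 1)."""
    law = {(1, 1): 1.0}  # law of (X_t, M_t)
    expected_up = 0.0
    for _ in range(n):
        new = {}
        for (x, mem), p in law.items():
            a = alpha2(mem) if x == 1 else alpha1(mem)
            for key, q in (((x, mem + 1), 1.0 - a), ((-x, 1), a)):
                if q > 0.0:
                    new[key] = new.get(key, 0.0) + p * q
        law = new
        # add P(X_t = 1) for the current time t
        expected_up += sum(p for (x, _), p in law.items() if x == 1)
    return expected_up / n


def limit_fraction_up():
    """Theta_2 / (Theta_1 + Theta_2) for the illustrative sequences above."""
    theta1 = 2.0            # sum_{n>=1} 0.5^(n-1)
    theta2 = math.exp(0.8)  # sum_{n>=1} 0.8^(n-1) / (n-1)!
    return theta2 / (theta1 + theta2)
```

By (4.8), the corresponding value of $\mathbb{E}(S_n)/n$ is $(1 + 2\,\mathbb{E}[L_n(1)])/n - 1$, and the gap to the limit is of order $1/n$, in line with (4.44).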



5 From persistent random walk to generalized integrated telegraph noise (GITN)

Let $(X_n, M_n)_{n\ge 0}$ be a $\{-1,1\}\times\mathbb{N}^*$-valued Markov chain satisfying (2.1) and (2.2) and let $(S_n)_{n\ge 0}$ be the associated persistent random walk, see (4.1). We assume in this section that the transition probabilities $(\alpha_{i,n})$ depend on a small parameter $\varepsilon > 0$ and that $\varepsilon$ also appears both in a time scale and in a space scale of the persistent random walk. We prove that there exists a normalization expressed in terms of $\varepsilon$ so that $(X_n, M_n, S_n)$ converges in distribution as $\varepsilon\to 0$. This limit is a time continuous process. Such a procedure has already been performed in [10] when the increments are a Markov chain.
More precisely we suppose that the transition probabilities satisfy

$$\alpha_{i,n} = f_i(n\varepsilon)\,\varepsilon + a_{i,n,\varepsilon}\,\varepsilon, \quad n\ge 1,\ i = 1,2, \tag{5.48}$$

where $f_1$ and $f_2$ are positive right-continuous functions with left limits satisfying

$$\int_0^{\infty} f_i(u)\,du = \infty, \quad i = 1,2, \tag{5.49}$$

and $a_{i,n,\varepsilon}\in\mathbb{R}$ with $\lim_{\varepsilon\to 0}\sup_{i,n}|a_{i,n,\varepsilon}| = 0$. It is clear that for any $i$, $n$ fixed, $\lim_{\varepsilon\to 0}\alpha_{i,n} = 0$. Therefore $(X_n)$ changes from $-1$ to $1$ (for instance) with a small probability. The trend of $(X_n)$ is to stay at the same level.

Let us now introduce the scaling procedure. For any $\varepsilon > 0$ and for any $t\in\varepsilon\mathbb{N}$, we define the processes

$$S^\varepsilon(t) = \varepsilon S_{t/\varepsilon}, \quad M^\varepsilon(t) = \varepsilon M_{t/\varepsilon} \quad\text{and}\quad X^\varepsilon(t) = X_{t/\varepsilon}. \tag{5.50}$$

Note that $(S_n)$ depends on $\varepsilon$, since the two families of coefficients $(\alpha_{1,n})$ and $(\alpha_{2,n})$ depend on $\varepsilon$. For the sake of simplicity, we do not mention the dependency with respect to $\varepsilon$. We extend the definition of the process $(S^\varepsilon(t),\ t\in\varepsilon\mathbb{N})$ to $t\in\mathbb{R}_+$ by linear interpolation and we extend the processes $(X^\varepsilon(t),\ t\in\varepsilon\mathbb{N})$ and $(M^\varepsilon(t),\ t\in\varepsilon\mathbb{N})$ into piecewise constant functions that are right-continuous with left limits. In order to describe the asymptotic behavior of $(S^\varepsilon(t),\ t\ge 0)$ as $\varepsilon\to 0$, it suffices to study the asymptotic properties of the times of trend changes. Indeed $t\mapsto S^\varepsilon(t)$ has slope $1$ till the stopping time $\varepsilon T_1$, with $T_1$ defined by (2.17). After that instant, the path has slope $-1$ till $\varepsilon T_2$, and so on. The increments change alternately from $1$ to $-1$ and vice versa.

As $\varepsilon\to 0$, we shall prove that the limit process $(S^0(t),\ t\ge 0)$ is still piecewise linear. More precisely it starts at $t = 0$ with a slope equal to $1$. At a random time $e_1$ the slope changes and becomes equal to $-1$, at the random time $e_1 + e_2$ we observe a new change of slope, and so on. We are therefore particularly interested in the description of the distribution of $(e_n)_{n\ge 1}$.

Theorem 5.1. 1. Let us consider a sequence $(e_n)_{n\ge 1}$ of independent random variables such that for $n\ge 1$,

$$\mathbb{P}(e_{2n-1} > t) = \exp\Bigl(-\int_0^t f_2(u)\,du\Bigr), \qquad \mathbb{P}(e_{2n} > t) = \exp\Bigl(-\int_0^t f_1(u)\,du\Bigr), \tag{5.51}$$

where $f_1$ and $f_2$ have been introduced in (5.48). Let

$$N^0(t) := \sum_{n\ge 1} 1_{\{e_1+\ldots+e_n\le t\}}, \quad\text{for any } t\ge 0,$$

be the counting process,

$$m(t) := t - \sup\{e_1+\ldots+e_n : e_1+\ldots+e_n\le t\} = t - \bigl(e_1+\ldots+e_{N^0(t)}\bigr)$$

the associated age process (spent life) and finally

$$S^0(t) = \int_0^t (-1)^{N^0(s)}\,ds, \quad t\ge 0, \tag{5.52}$$

the so-called Generalized Integrated Telegraph Noise (GITN).

2. Let $(X_n, M_n)_{n\ge 0}$ be a $\{-1,1\}\times\mathbb{N}^*$-valued Markov chain whose probability transition satisfies (2.1) and is $\varepsilon$-dependent in the sense of (5.48). We assume $X_0 = M_0 = 1$.

(i) For all $n\ge 1$, the sequence of times between two consecutive slope changes $(\varepsilon T_1, \varepsilon(T_2-T_1),\ldots,\varepsilon(T_n-T_{n-1}))$ converges in distribution towards $(e_1,\ldots,e_n)$ as $\varepsilon\to 0$, where the sequence $(T_k)_{k\ge 0}$ is defined by (2.17).

(ii) The following convergence in distribution in Skorohod's topology holds

$$\bigl(S^\varepsilon(t), X^\varepsilon(t), M^\varepsilon(t),\ t\ge 0\bigr) \xrightarrow[\varepsilon\to 0]{} \bigl(S^0(t), (-1)^{N^0(t)}, m(t),\ t\ge 0\bigr), \tag{5.53}$$

where $S^\varepsilon(t)$, $M^\varepsilon(t)$ and $X^\varepsilon(t)$ are defined by (5.50).

Moreover $\bigl(S^0(t), (-1)^{N^0(t)}, m(t),\ t\ge 0\bigr)$ and $\bigl((-1)^{N^0(t)}, m(t),\ t\ge 0\bigr)$ are Markov processes.



Remark 5.2. (i) In the case $X_0 = -1$, the family of processes $(S^\varepsilon(t))_{t\ge 0}$ converges in distribution to $(S^0(t))_{t\ge 0}$ as $\varepsilon$ goes to zero, where for any $t\ge 0$,

$$S^0(t) = -\int_0^t (-1)^{N^0(s)}\,ds, \quad\text{and}\quad N^0(t) = \sum_{n\ge 1} 1_{\{e_2+\ldots+e_{n+1}\le t\}}.$$

In the particular case where the functions $f_1$ and $f_2$ are constant, it has been proved in [10] that a particular solution of the telegraph equation can be represented in terms of $S^0(t)$. That explains why $(S^0(t))$ defined by (5.52) is called the Generalized Integrated Telegraph Noise (GITN).

(ii) In the classical integrated telegraph noise [10], the random variables $(e_n,\ n\ge 1)$ are exponentially distributed, therefore $(S^0(t), N^0(t))$ is Markovian. For the generalized situation, this property is not true anymore and we need to consider some additional information. This information is given by $D_-$, the left derivative of the GITN, which is directly related to the age process

$$m(t) = t - \sup\{s\ge 0 : D_-S^0(s)\ne D_-S^0(t)\}.$$

(iii-a) Davis wrote in [?] that "almost all the continuous-time stochastic process models of applied probability consist of some combination of the following: diffusion, deterministic motion and random jumps". According to Theorem 5.1, between two consecutive random jumps the GITN moves in a deterministic way and therefore belongs to the family of the so-called Piecewise Deterministic Markov Processes, see for instance [5], [?], [?].

(iii-b) The possible values of $X^0(t)$ are $\{-1,1\}$. It is possible to deal with the case where $X^0(t)\in\{a_1,\ldots,a_K\}$. In that case $X^0(t)$ is a Markov chain indexed by $\mathbb{R}_+$ and $\{a_1,\ldots,a_K\}$-valued. This situation has already been treated in [10], when the functions $(f_i)_{1\le i\le K}$ are constant.

(iii-c) $(S^0(t);\ t\ge 0)$ is a semi-Markov process, see [?], [77]. In [?] (Theorem 3.3 in Chapter 4) it has been proved that $(X^\varepsilon(t);\ t\ge 0)$ converges to the semi-Markov process $(X^0(t),\ t\ge 0)$. This result is weaker than ours since we have considered the convergence of $(S^\varepsilon(t), M^\varepsilon(t), X^\varepsilon(t))_{t\ge 0}$.

Proof. 



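Step 1: Convergence of the jump times. Let us define $\mathcal{R}^\varepsilon_n := (\varepsilon T_1, \varepsilon T_2,\ldots,\varepsilon T_n)$ for $n\ge 1$. According to Proposition 2.6, $(T_n - T_{n-1})_{n\ge 1}$ is a sequence of independent random variables. In order to prove the convergence in distribution of $\mathcal{R}^\varepsilon_n$ as $\varepsilon$ tends to 0, it suffices to analyze the behaviour of $\varepsilon(T_n - T_{n-1})$ where $n\ge 1$ is given. Recall that $T_0 = 0$. Remark 2.7 and (2.19) yield:

$$\mathbb{P}\bigl(\varepsilon(T_{2n+1} - T_{2n}) > t\bigr) = \mathbb{P}\Bigl(T_{2n+1} - T_{2n} > \frac{t}{\varepsilon}\Bigr) = \mathbb{P}\Bigl(T_{2n+1} - T_{2n} > \Bigl\lfloor\frac{t}{\varepsilon}\Bigr\rfloor\Bigr) = (1-\alpha_{2,1})\times\ldots\times(1-\alpha_{2,\lfloor t/\varepsilon\rfloor}),$$

where $\lfloor a\rfloor$ stands for the integer part of $a$. Defining

$$\delta_\varepsilon(t) := \log\bigl\{\mathbb{P}\bigl(\varepsilon(T_{2n+1} - T_{2n}) > t\bigr)\bigr\} = \sum_{j=1}^{\lfloor t/\varepsilon\rfloor}\log(1-\alpha_{2,j}),$$

and using (5.48), we get

$$\delta_\varepsilon(t) = \sum_{j=1}^{\lfloor t/\varepsilon\rfloor}\log\bigl(1 - \varepsilon f_2(j\varepsilon) - a_{2,j,\varepsilon}\,\varepsilon\bigr).$$

Due to the continuity of the function $f_2$ and to the uniform convergence of $a_{2,j,\varepsilon}$ to zero,

$$\lim_{\varepsilon\to 0}\delta_\varepsilon(t) = -\lim_{\varepsilon\to 0}\varepsilon\sum_{j=1}^{\lfloor t/\varepsilon\rfloor} f_2(j\varepsilon) = -\int_0^t f_2(u)\,du. \tag{5.54}$$

Hence for any $t\ge 0$,

$$\lim_{\varepsilon\to 0}\mathbb{P}\bigl(\varepsilon(T_{2n+1} - T_{2n}) > t\bigr) = \exp\Bigl(-\int_0^t f_2(u)\,du\Bigr).$$

The same arguments lead to

$$\lim_{\varepsilon\to 0}\mathbb{P}\bigl(\varepsilon(T_{2n+2} - T_{2n+1}) > t\bigr) = \exp\Bigl(-\int_0^t f_1(u)\,du\Bigr).$$

We conclude that $\mathcal{R}^\varepsilon_n$ converges in distribution towards $(e_1, e_1+e_2,\ldots,e_1+e_2+\ldots+e_n)$, for any $n\ge 1$.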
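The convergence (5.54) is a Riemann-sum argument and can be observed numerically. The sketch below evaluates the survival product for a given $\varepsilon$ and compares it with the limiting exponential; it assumes the remainder $a_{2,j,\varepsilon}$ is zero and uses the illustrative hazard $f_2(u) = 1 + u$, for which $\int_0^t f_2 = t + t^2/2$.

```python
import math


def survival_product(t, eps, f):
    """prod_{j=1}^{floor(t/eps)} (1 - eps * f(j * eps)).

    This is P(eps * (T_{2n+1} - T_{2n}) > t) when alpha_{2,j} = eps * f(j * eps),
    i.e. when the remainder term a_{2,j,eps} of (5.48) is taken equal to 0.
    """
    prod = 1.0
    for j in range(1, int(math.floor(t / eps)) + 1):
        prod *= 1.0 - eps * f(j * eps)
    return prod


def survival_limit(t):
    """exp(-int_0^t (1 + u) du) = exp(-(t + t^2 / 2))."""
    return math.exp(-(t + 0.5 * t * t))


def approximation_error(t, eps):
    """Absolute gap between the discrete survival probability and its limit."""
    return abs(survival_product(t, eps, lambda u: 1.0 + u) - survival_limit(t))
```

The gap is of order $\varepsilon$: both the Riemann-sum error and the second-order term of $\log(1-x)$ contribute at that order, so dividing $\varepsilon$ by 100 reduces the error by roughly the same factor.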

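Step 2 — Duality and convergence of the counting process. Let us define the following right-continuous counting process:

\[
N^\varepsilon(t) = \sup\{n \ge 0 : \varepsilon T_n \le t\} = \sum_{n \ge 1} \mathbb{1}_{\{\varepsilon T_n \le t\}}. \tag{5.55}
\]

In order to prove (5.53) we first point out the convergence of the counting process $N^\varepsilon$ towards $N^0$. The one-to-one correspondence between $(N^\varepsilon(t))_{t \ge 0}$ and $(T_n)_{n \ge 1}$ implies that for any $0 \le t_1 < \dots < t_n$, the convergence in distribution of $(N^\varepsilon(t_1), \dots, N^\varepsilon(t_n))$ as $\varepsilon$ tends to zero is a consequence of the convergence of the rescaled jump times $(\varepsilon T_k)_{k \ge 1}$. Indeed

\[
\mathbb{P}\bigl(N^\varepsilon(t_1) = j_1, \dots, N^\varepsilon(t_n) = j_n\bigr) = \mathbb{P}\bigl(\varepsilon T_{j_1} \le t_1 < \varepsilon T_{j_1+1}, \dots, \varepsilon T_{j_n} \le t_n < \varepsilon T_{j_n+1}\bigr)
\]

and consequently

\[
\lim_{\varepsilon \to 0} \mathbb{P}\bigl(N^\varepsilon(t_1) = j_1, \dots, N^\varepsilon(t_n) = j_n\bigr) = \mathbb{P}\bigl(E_{j_1} \le t_1 < E_{j_1+1}, \dots, E_{j_n} \le t_n < E_{j_n+1}\bigr),
\]

where $E_n = \sum_{k=1}^n e_k$. In order to obtain the convergence of the counting processes, it suffices to use a tightness criterion (see, for instance, [?, Theorem 15.2 p. 125]). Let $s < t$ and let us denote $\tau_{st} := \lfloor t/\varepsilon \rfloor - \lfloor s/\varepsilon \rfloor$, then

\[
\begin{aligned}
d^\varepsilon_{st} &:= \mathbb{P}\bigl(N^\varepsilon(t) > N^\varepsilon(s)\bigr) = 1 - \mathbb{P}\bigl(N^\varepsilon(t) = N^\varepsilon(s)\bigr) \\
&= 1 - \mathbb{P}\bigl(N^\varepsilon(t) = N^\varepsilon(s),\, N^\varepsilon(s) \in 2\mathbb{N}\bigr) - \mathbb{P}\bigl(N^\varepsilon(t) = N^\varepsilon(s),\, N^\varepsilon(s) \in 2\mathbb{N} + 1\bigr).
\end{aligned} \tag{5.56}
\]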
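Since $X_0 = 1$, if $N^\varepsilon(s) \in 2\mathbb{N}$ we have on one hand $X_{\lfloor s/\varepsilon \rfloor} = 1$ and on the other hand $M_{\lfloor s/\varepsilon \rfloor} \le \lfloor s/\varepsilon \rfloor + 1$. Assuming $M_{\lfloor s/\varepsilon \rfloor} = \ell + 1$ with $0 \le \ell \le \lfloor s/\varepsilon \rfloor$, then

\[
\begin{aligned}
P_{st}(\ell) &:= \mathbb{P}\bigl(N^\varepsilon(t) = N^\varepsilon(s) \,\big|\, M_{\lfloor s/\varepsilon \rfloor} = \ell + 1,\ X_{\lfloor s/\varepsilon \rfloor} = 1\bigr) \\
&= \mathbb{P}\bigl(X_{\lfloor s/\varepsilon \rfloor + 1} = 1, \dots, X_{\lfloor s/\varepsilon \rfloor + \tau_{st}} = 1 \,\big|\, M_{\lfloor s/\varepsilon \rfloor} = \ell + 1,\ X_{\lfloor s/\varepsilon \rfloor} = 1\bigr) \\
&= (1 - \alpha_{2,\ell+1})(1 - \alpha_{2,\ell+2}) \cdots (1 - \alpha_{2,\ell+\tau_{st}}).
\end{aligned} \tag{5.57}
\]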

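It follows that

\[
\begin{aligned}
\mathbb{P}\bigl(N^\varepsilon(t) = N^\varepsilon(s),\, N^\varepsilon(s) \in 2\mathbb{N}\bigr)
&= \sum_{\ell=0}^{\lfloor s/\varepsilon \rfloor} \mathbb{P}\bigl(N^\varepsilon(t) = N^\varepsilon(s),\, N^\varepsilon(s) \in 2\mathbb{N},\, M_{\lfloor s/\varepsilon \rfloor} = \ell + 1\bigr) \\
&= \sum_{\ell=0}^{\lfloor s/\varepsilon \rfloor} \prod_{k=1}^{\tau_{st}} (1 - \alpha_{2,k+\ell})\, \mathbb{P}\bigl(N^\varepsilon(s) \in 2\mathbb{N},\, M_{\lfloor s/\varepsilon \rfloor} = \ell + 1\bigr) \\
&\ge \inf_{0 \le \ell \le \lfloor s/\varepsilon \rfloor} \prod_{k=1}^{\tau_{st}} (1 - \alpha_{2,k+\ell})\, \mathbb{P}\bigl(N^\varepsilon(s) \in 2\mathbb{N}\bigr).
\end{aligned} \tag{5.58}
\]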

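Similar arguments are used in the odd case $N^\varepsilon(s) \in 2\mathbb{N} + 1$. In this situation $X_{\lfloor s/\varepsilon \rfloor} = -1$ and the sequence $(\alpha_{2,\cdot})$ in (5.57) is therefore replaced by $(\alpha_{1,\cdot})$. Combining (5.58) with (5.56), we obtain

\[
d^\varepsilon_{st} \le 1 - \inf_{0 \le \ell \le \lfloor s/\varepsilon \rfloor} \prod_{k=1}^{\tau_{st}} (1 - \alpha_{2,k+\ell})\, \mathbb{P}\bigl(N^\varepsilon(s) \in 2\mathbb{N}\bigr) - \inf_{0 \le \ell \le \lfloor s/\varepsilon \rfloor} \prod_{k=1}^{\tau_{st}} (1 - \alpha_{1,k+\ell})\, \mathbb{P}\bigl(N^\varepsilon(s) \in 2\mathbb{N} + 1\bigr).
\]

By (5.48), we get

\[
\begin{aligned}
d^\varepsilon_{st} &\le 1 - \inf_{i=1,2} \Bigl\{ \inf_{0 \le \ell \le \lfloor s/\varepsilon \rfloor} \prod_{k=1}^{\tau_{st}} \bigl(1 - \varepsilon f_i(\varepsilon(k + \ell))\bigr) \Bigr\} + o(\varepsilon) \\
&\le 1 - \inf_{i=1,2} \Bigl(1 - \varepsilon \sup_{0 \le u \le t + \varepsilon} f_i(u)\Bigr)^{\tau_{st}} + o(\varepsilon) \\
&\le 1 - \Bigl(1 - \varepsilon \sup_{0 \le u \le t + \varepsilon} f_1(u) \vee f_2(u)\Bigr)^{\tau_{st}} + o(\varepsilon).
\end{aligned}
\]

Since $\varepsilon \tau_{st} \le t - s + \varepsilon$, for any $\delta > 0$, $N > 0$, we can find $\varepsilon_0 > 0$ such that $d^\varepsilon_{st} \le \delta$ for all $\varepsilon \le \varepsilon_0$ and $t, s \le N$. We deduce that the set of all the distributions of $N^\varepsilon$, $\varepsilon \in\, ]0, 1]$, is weakly relatively compact and finally obtain the convergence in law of $N^\varepsilon$ towards $N^0$.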



Step 3 — Convergence of $(S^\varepsilon, X^\varepsilon, M^\varepsilon)$. We have just proved that $(N^\varepsilon(t))_{t \ge 0}$ converges in distribution towards $(N^0(t))_{t \ge 0}$. The paths of these processes belong to the Skorohod space $D$. The two main ingredients of the proof are the following. First we note that $S^\varepsilon(t)$, $X^\varepsilon(t)$ and $M^\varepsilon(t)$ can be expressed continuously in terms of the process $(N^\varepsilon(s), s \le t)$ and secondly we use the convergence of $N^\varepsilon$. For the process $S^\varepsilon(t)$, we introduce the mapping $F_1 : D(0,1) \to C(0,1)$ defined for $t \in [0,1]$ by

\[
F_1(f)(t) = \int_0^t \cos(\pi f(s))\,ds.
\]

Since $N^\varepsilon$ is $\mathbb{N}$-valued, we get

\[
F_1(N^\varepsilon)(t) = \int_0^t \cos(\pi N^\varepsilon(s))\,ds = \int_0^t (-1)^{N^\varepsilon(s)}\,ds.
\]
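Note that (5.30) combined with (1.1) imply that $S^\varepsilon(t) = \varepsilon + \int_0^t (-1)^{N^\varepsilon(s+\varepsilon)}\,ds$. Finally the definition of $S^\varepsilon(t)$ leads to

\[
\bigl|S^\varepsilon(t) - F_1(N^\varepsilon)(t)\bigr| = \Bigl|\varepsilon + \int_0^t (-1)^{N^\varepsilon(s+\varepsilon)}\,ds - \int_0^t (-1)^{N^\varepsilon(s)}\,ds\Bigr| \le 3\varepsilon. \tag{5.59}
\]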



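For the process $X^\varepsilon$, we observe that $X^\varepsilon(t) = F_2(N^\varepsilon(t)) := \cos(\pi N^\varepsilon(t))$ and the memory process is linked to the age process of $N^\varepsilon$:

\[
\bigl|M^\varepsilon(t) - \bigl(t - \inf\{s \ge 0 : N^\varepsilon(s) = N^\varepsilon(t)\}\bigr)\bigr| \le \varepsilon.
\]

Let us just note that for $\Phi(x) = \cos\bigl(\tfrac{\pi}{2} x\bigr)\,\mathbb{1}_{[-1,1]}(x)$, which is a continuous function, we get

\[
t - \inf\{s \ge 0 : N^\varepsilon(s) = N^\varepsilon(t)\} = \int_0^t \Phi\bigl(N^\varepsilon(t) - N^\varepsilon(s)\bigr)\,ds = F_3(N^\varepsilon)(t),
\]

where

\[
F_3 : f \mapsto \Bigl(\int_0^t \Phi\bigl(f(t) - f(s)\bigr)\,ds,\ t \ge 0\Bigr).
\]

In order to prove (5.53), it suffices to use the convergence in distribution of $N^\varepsilon$ towards $N^0$ developed in Step 2 and the continuity in the Skorohod topology of the three functions $F_1$, $F_2$ and $F_3$ (see Lemmas A.1, A.2 and A.3). Finally we note that, in agreement with the definitions of $F_1$ and $F_2$ above, $(-1)^{N^0(t)} = F_2(N^0(t))$, $S^0(t) = F_1(N^0)(t)$ and $M(t) = F_3(N^0)(t)$. ■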
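To make the three functionals of Step 3 concrete, here is a small sketch that evaluates them on a deterministic counting path. It is not part of the proof: the path is described by a hypothetical sorted list of jump times, $F_1$ is the integral of $(-1)^{N(s)}$, $F_2$ is $\cos(\pi N(t))$, and $F_3$ integrates $\Phi(N(t) - N(s))$ with $\Phi(x) = \cos(\pi x/2)\mathbb{1}_{[-1,1]}(x)$. The helper `age` computes $t - \inf\{s \ge 0 : N(s) = N(t)\}$ directly, so that it can be compared with the integral form.

```python
import bisect
import math

def count(jumps, s):
    """N(s): number of jump times <= s (right-continuous counting path)."""
    return bisect.bisect_right(jumps, s)

def F1(jumps, t):
    """Integral over [0, t] of (-1)^{N(s)}, computed exactly piece by piece."""
    total, left, sign = 0.0, 0.0, 1.0
    for tau in jumps:
        if tau >= t:
            break
        total += sign * (tau - left)
        left, sign = tau, -sign
    return total + sign * (t - left)

def F2(jumps, t):
    """cos(pi * N(t)), which equals (-1)^{N(t)} for an integer-valued path."""
    return math.cos(math.pi * count(jumps, t))

def phi(x):
    """Phi(x) = cos(pi x / 2) on [-1, 1] and 0 elsewhere."""
    return math.cos(math.pi * x / 2) if abs(x) <= 1 else 0.0

def F3(jumps, t, steps=100_000):
    """Midpoint-rule approximation of int_0^t Phi(N(t) - N(s)) ds."""
    h = t / steps
    n_t = count(jumps, t)
    return h * sum(phi(n_t - count(jumps, (k + 0.5) * h)) for k in range(steps))

def age(jumps, t):
    """t - inf{s >= 0 : N(s) = N(t)}: time elapsed since the last jump before t."""
    k = count(jumps, t)
    return t - (jumps[k - 1] if k > 0 else 0.0)

if __name__ == "__main__":
    jumps = [0.2, 0.5, 0.9]  # hypothetical jump times
    print(F1(jumps, 1.0), F2(jumps, 1.0), F3(jumps, 1.0), age(jumps, 1.0))
```

On any such path, `F3` and `age` should agree up to the discretisation error of the midpoint rule, since $\Phi$ vanishes at every nonzero integer.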

Examples. For some particular $f_1$, the related random variable $e_{2n}$ has a distribution which belongs to well-known families of laws.

• If $f_1$ is a constant function then each term of the sequence $(e_{2n})$ is exponentially distributed.

• If $f_1(x) = \alpha\lambda x^{\alpha - 1}$ with $\alpha > 0$ and $\lambda > 0$ then the law of $e_{2n}$ corresponds to the Weibull distribution with parameters $(\alpha, \lambda)$.

• If $f_1(x) = \frac{\alpha}{x}\,\mathbb{1}_{\{x > x_0\}}$ with $x_0 > 0$, then we deal with the Pareto distribution for $e_{2n}$.
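These three examples can be checked against the survival function $\exp(-\int_0^t f_1(u)\,du)$ obtained in Step 1. The sketch below is only an illustration under stated assumptions: the parameter values are arbitrary, and the Pareto rate is taken to be $\alpha/x$ on $\{x > x_0\}$, which is our reading of the third example. Each rate function is integrated numerically and compared with the classical closed-form survival function.

```python
import math

def survival_from_hazard(f, t, steps=200_000):
    """exp(-int_0^t f(u) du), with a midpoint rule (midpoints avoid u = 0)."""
    h = t / steps
    return math.exp(-h * sum(f((k + 0.5) * h) for k in range(steps)))

# Illustrative parameter values (assumptions, not taken from the text).
LAM, ALPHA, X0 = 1.5, 2.0, 0.5

# Each entry: (rate function f_1, closed-form survival function t -> P(e > t)).
EXAMPLES = {
    "exponential": (lambda u: LAM,
                    lambda t: math.exp(-LAM * t)),
    "weibull": (lambda u: ALPHA * LAM * u ** (ALPHA - 1),
                lambda t: math.exp(-LAM * t ** ALPHA)),
    "pareto": (lambda u: ALPHA / u if u > X0 else 0.0,
               lambda t: (X0 / t) ** ALPHA if t > X0 else 1.0),
}

if __name__ == "__main__":
    for name, (rate, closed_form) in EXAMPLES.items():
        print(name, survival_from_hazard(rate, 2.0), closed_form(2.0))
```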

It has been shown in [10] that the density part of the distribution of $S(t)$ can be expressed via Bessel functions. Here, we have a weaker result: we are only able to determine the Laplace transform of $S(t)$ (see Proposition 5.3 below). Since we are unable to invert this transform, the distribution of $S(t)$ remains unknown. Although the path description of $(S(t))_{t \ge 0}$ is very simple, only few properties of the GITN are known.

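Proposition 5.3. Let $(S^0(t))_{t \in \mathbb{R}_+}$ be the GITN defined by (5.52). Then the double Laplace transform defined by

\[
\mathcal{L}(r, \gamma) := \int_0^\infty e^{-rt}\, \mathbb{E}\bigl[e^{-\gamma S^0(t)}\bigr]\,dt, \quad r > 0,\ \gamma > 0, \tag{5.60}
\]

is equal to

\[
\frac{-(r + \gamma)\,\mathcal{R}(r - \gamma, f_1)\,\mathcal{R}(r + \gamma, f_2) + \mathcal{R}(r - \gamma, f_1) + \mathcal{R}(r + \gamma, f_2)}{(r - \gamma)\,\mathcal{R}(r - \gamma, f_1) + (r + \gamma)\,\mathcal{R}(r + \gamma, f_2) - (r^2 - \gamma^2)\,\mathcal{R}(r - \gamma, f_1)\,\mathcal{R}(r + \gamma, f_2)},
\]

where

\[
\mathcal{R}(z, f_i) = \int_0^\infty e^{-zt - \int_0^t f_i(u)\,du}\,dt, \quad z \in \mathbb{R},\ i = 1, 2. \tag{5.61}
\]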



Remark 5.4. (i) In the particular constant case, that is $f_1(t) = f_0$ and $f_2(t) = g_0$ for all $t \ge 0$, the stochastic process corresponds to the so-called integrated telegraph noise introduced in [10]. For this process, we get $\mathcal{R}(z, f_i) = (z + f_i)^{-1}$ for $i = 1, 2$. The double Laplace transform $\mathcal{L}$ becomes

\[
\frac{f_0 + g_0 + r - \gamma}{r^2 - \gamma^2 + (r - \gamma) g_0 + (r + \gamma) f_0}.
\]

This identity was already obtained by Weiss in [18] and presented in [10] (see Remark 3.10).

(ii) Let $\xi$ be an exponential r.v. with parameter $r$ independent from $(S^0(t), t \ge 0)$. Since the density of $\xi$ is $r e^{-rt}$, the quantity $r\mathcal{L}(r, \gamma)$ is the Laplace transform of $S^0(\xi)$:

\[
r\,\mathcal{L}(r, \gamma) = \mathbb{E}\bigl[e^{-\gamma S^0(\xi)}\bigr].
\]
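The reduction of Proposition 5.3 to the constant case in Remark 5.4 (i) is a short algebraic computation that can also be verified numerically. The sketch below is an illustration with arbitrary numerical values for $r$, $\gamma$, $f_0$ and $g_0$; the function names are ours. It plugs $\mathcal{R}(z, f_i) = (z + f_i)^{-1}$ into the general expression and compares the result with the closed formula of the remark.

```python
def R_const(z, rate):
    """R(z, f) for a constant rate f: int_0^inf exp(-(z + f) t) dt = 1 / (z + f), for z + f > 0."""
    return 1.0 / (z + rate)

def double_laplace(r, gamma, R1, R2):
    """Expression of Proposition 5.3, with R1 = R(r - gamma, f_1) and R2 = R(r + gamma, f_2)."""
    num = -(r + gamma) * R1 * R2 + R1 + R2
    den = (r - gamma) * R1 + (r + gamma) * R2 - (r * r - gamma * gamma) * R1 * R2
    return num / den

def itn_formula(r, gamma, f0, g0):
    """Closed formula of Remark 5.4 (i) for constant rates f_1 = f0 and f_2 = g0."""
    return (f0 + g0 + r - gamma) / (r * r - gamma * gamma + (r - gamma) * g0 + (r + gamma) * f0)

if __name__ == "__main__":
    r, gamma, f0, g0 = 2.0, 0.5, 1.0, 3.0  # arbitrary illustrative values
    general = double_laplace(r, gamma, R_const(r - gamma, f0), R_const(r + gamma, g0))
    print(general, itn_formula(r, gamma, f0, g0))
```

At $\gamma = 0$ both expressions reduce to $1/r$, as expected from $\int_0^\infty e^{-rt}\,dt$.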

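PROOF of Proposition 5.3. Recall that $S^\varepsilon(t)$ is the piecewise continuous process defined by (5.50). By Theorem 5.1 and the Lebesgue convergence theorem, we just need to study the convergence of $\mathcal{L}^\varepsilon(r, \gamma)$, the double Laplace transform of $S^\varepsilon(t)$. As $\varepsilon \to 0$, we get

\[
\begin{aligned}
\mathcal{L}^\varepsilon(r, \gamma) &= \int_0^\infty e^{-rt}\, \mathbb{E}\bigl[e^{-\gamma S^\varepsilon(t)}\bigr]\,dt = \sum_{k \ge 0} \int_{k\varepsilon}^{(k+1)\varepsilon} e^{-rt} \Bigl(\mathbb{E}\bigl[e^{-\gamma \varepsilon S_k}\bigr] + o(\varepsilon)\Bigr)\,dt \\
&= \frac{1 - e^{-r\varepsilon}}{r} \sum_{k \ge 0} e^{-r\varepsilon k}\, \mathbb{E}\bigl[e^{-\gamma \varepsilon S_k}\bigr] + o(\varepsilon) = \frac{1}{r}\, \mathbb{E}\bigl[e^{-\gamma \varepsilon S_\tau}\bigr] + o(\varepsilon),
\end{aligned} \tag{5.62}
\]

where $\tau + 1$ is a geometrically distributed random variable, independent of the process $(S_n)$:

\[
\mathbb{P}(\tau = n) = (e^{-r\varepsilon})^n (1 - e^{-r\varepsilon}).
\]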

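Obviously (5.62) shows that $r\mathcal{L}^\varepsilon(r, \gamma)$ and $\mathbb{E}[e^{-\gamma \varepsilon S_\tau}]$ have the same limit as $\varepsilon \to 0$. Note that choosing $\lambda = e^{-\gamma\varepsilon}$ and $\rho = e^{-r\varepsilon}$ in Theorem 4.10 gives the value of $\mathbb{E}[e^{-\gamma \varepsilon S_\tau}]$. Due to the specific form of (4.28) we are led to prove the following intermediate result:

\[
\lim_{\varepsilon \to 0} \varepsilon\, \mathcal{P}_i(e^{-\varepsilon z}) = \mathcal{R}(z, f_i), \tag{5.63}
\]

where $\mathcal{P}_i$ (resp. $\mathcal{R}(z, f_i)$) is defined by (4.29) (resp. (5.61)). Indeed, according to the definition of $\mathcal{P}_i$ we easily get

\[
\varepsilon\, \mathcal{P}_i(e^{-\varepsilon z}) = \frac{z\varepsilon}{1 - e^{-\varepsilon z}} \int_0^\infty \mathcal{P}_i\Bigl(\Bigl\lfloor \frac{t}{\varepsilon} \Bigr\rfloor + 1\Bigr) e^{-zt}\,dt.
\]

Using (5.54) (where the index 2 is replaced by $i$) yields

\[
\lim_{\varepsilon \to 0} \mathcal{P}_i\Bigl(\Bigl\lfloor \frac{t}{\varepsilon} \Bigr\rfloor + 1\Bigr) = \exp\Bigl(-\int_0^t f_i(u)\,du\Bigr).
\]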

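Then, the dominated convergence theorem implies (5.63). Since

• $S^\varepsilon(t)$ converges in distribution to $S^0(t)$ as $\varepsilon \to 0$,

• $\rho - 1 \sim -r\varepsilon$ and $\lambda\rho - 1 \sim -(r + \gamma)\varepsilon$ as $\varepsilon \to 0$,

then (5.62) and Theorem 4.10 imply

\[
\begin{aligned}
\mathcal{L}(r, \gamma) &= \lim_{\varepsilon \to 0} \mathcal{L}^\varepsilon(r, \gamma) = \frac{1}{r} \lim_{\varepsilon \to 0} \mathbb{E}\bigl[(e^{-\gamma\varepsilon})^{S_\tau}\bigr] \\
&= \lim_{\varepsilon \to 0} \frac{-(r + \gamma) R_1^\varepsilon(r - \gamma) R_2^\varepsilon(r + \gamma) + R_1^\varepsilon(r - \gamma) + R_2^\varepsilon(r + \gamma)}{(r - \gamma) R_1^\varepsilon(r - \gamma) + (r + \gamma) R_2^\varepsilon(r + \gamma) - (r^2 - \gamma^2) R_1^\varepsilon(r - \gamma) R_2^\varepsilon(r + \gamma)},
\end{aligned}
\]

where $R_i^\varepsilon(z) = \varepsilon\, \mathcal{P}_i(e^{-z\varepsilon})$.

It is clear that Proposition 5.3 is a straightforward consequence of (5.63) and the above identity. ■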
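The intermediate limit (5.63) can be illustrated in the constant case. The sketch below rests on two assumptions that are not taken verbatim from the text: the generating function is read as $\mathcal{P}_i(x) = \sum_{n \ge 1} \mathcal{P}_i(n) x^{n-1}$ with $\mathcal{P}_i(n) = \prod_{k=1}^{n-1}(1 - \alpha_{i,k})$ (which is what the integral identity used in the proof suggests), and the transition probabilities are exactly $\alpha_{i,k} = \varepsilon f$ for a constant rate $f$. Under these assumptions $\varepsilon \mathcal{P}_i(e^{-\varepsilon z})$ should approach $\mathcal{R}(z, f) = 1/(z + f)$.

```python
import math

def eps_P(z, rate, eps, n_terms=None):
    """eps * sum_{n>=1} P(n) x^{n-1} with x = exp(-eps * z) and P(n) = (1 - eps * rate)^{n-1}.

    Assumed reading of the generating function; the series is geometric with ratio q.
    """
    x = math.exp(-eps * z)
    q = (1.0 - eps * rate) * x  # common ratio of the geometric series
    n_terms = n_terms or int(50 / (1.0 - q))  # truncation: neglected tail is of order exp(-50)
    return eps * sum(q ** k for k in range(n_terms))

def R_const(z, rate):
    """Limit R(z, f) = 1 / (z + f) for a constant rate f."""
    return 1.0 / (z + rate)

if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, eps_P(1.0, 2.0, eps), R_const(1.0, 2.0))
```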



A Continuity in the Skorohod space

Let us denote by $D([0,1])$ the Skorohod space, i.e. the space of functions which are right-continuous and have left-hand limits. $D$ is a complete metric space for the following distance (see [?, Theorem 14.2])

\[
d(f, g) = \inf_{\lambda \in \Lambda} \max\bigl\{\|\lambda\|,\ \|f - g \circ \lambda\|_\infty\bigr\}, \tag{A.1}
\]

where

\[
\|\lambda\| = \sup_{s \ne t} \Bigl|\log \frac{\lambda(t) - \lambda(s)}{t - s}\Bigr|,
\]

$\|\cdot\|_\infty$ is the uniform norm and $\Lambda$ is the space of strictly increasing, continuous mappings of $[0, 1]$ onto itself.



Lemma A.1. Let $\Phi : \mathbb{R} \to \mathbb{R}$ be a continuous function, then $f \in D([0,1]) \mapsto \Phi \circ f$ is continuous in the Skorohod topology.

PROOF. Let $f \in D([0,1])$. Then there exists $M > 0$ such that $|f(t)| \le M$ for all $t \in [0,1]$. For $\varepsilon > 0$, due to the uniform continuity of $\Phi$ on compact sets, there exists $\delta > 0$ such that: for any $(x, y) \in [-2M, 2M]^2$ satisfying $|x - y| < \delta$ we have $|\Phi(x) - \Phi(y)| < \varepsilon$. Let us consider now a function $g \in D([0,1])$ such that $d(f, g) < \delta \wedge M$. Therefore, there exists $\lambda \in \Lambda$ such that $\|\lambda\| < \delta$ and $\|f - g \circ \lambda\|_\infty < \delta \wedge M$. Consequently

\[
\|\Phi(f) - \Phi(g \circ \lambda)\|_\infty \le \varepsilon.
\]

Continuity of $f \mapsto \Phi \circ f$ at $f$ follows from the definition of Skorohod's distance.



Lemma A.2. The mapping $f \in D([0,1]) \mapsto \bigl(\int_0^t f(u)\,du,\ t \ge 0\bigr)$ is continuous in the Skorohod topology.



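PROOF. First let us recall that any function belonging to the Skorohod space is integrable. We denote $I_f(t) = \int_0^t f(u)\,du$. Let $f, g \in D([0,1])$ be such that $d(f, g) < \delta$ and choose $\lambda \in \Lambda$ with $\|\lambda\| < \delta$ and $\|f - g \circ \lambda\|_\infty < \delta$. We get

\[
\begin{aligned}
\bigl|I_f(t) - I_g \circ \lambda(t)\bigr| &= \lim_{n \to \infty} \Bigl|\sum_{k=1}^n \Bigl\{\frac{t}{n} f\Bigl(\frac{kt}{n}\Bigr) - g \circ \lambda\Bigl(\frac{kt}{n}\Bigr) \Bigl(\lambda\Bigl(\frac{kt}{n}\Bigr) - \lambda\Bigl(\frac{(k-1)t}{n}\Bigr)\Bigr)\Bigr\}\Bigr| \\
&\le \lim_{n \to \infty} \Bigl|\frac{t}{n} \sum_{k=1}^n (f - g \circ \lambda)\Bigl(\frac{kt}{n}\Bigr)\Bigr| + \lim_{n \to \infty} \Bigl|\sum_{k=1}^n g \circ \lambda\Bigl(\frac{kt}{n}\Bigr) \Bigl\{\frac{t}{n} - \lambda\Bigl(\frac{kt}{n}\Bigr) + \lambda\Bigl(\frac{(k-1)t}{n}\Bigr)\Bigr\}\Bigr|.
\end{aligned} \tag{A.2}
\]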
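By definition of the norm on the Skorohod space, we have, for $s \ne t$,

\[
e^{-\|\lambda\|} \le \frac{\lambda(t) - \lambda(s)}{t - s} \le e^{\|\lambda\|}.
\]

Consequently for any $0 \le s < t \le 1$, we have

\[
\bigl|\lambda(t) - \lambda(s) - (t - s)\bigr| \le (t - s) \max\bigl(e^{\|\lambda\|} - 1,\ 1 - e^{-\|\lambda\|}\bigr) \le (t - s)\bigl(e^{\|\lambda\|} - 1\bigr). \tag{A.3}
\]

Combining (A.2) and (A.3) yields

\[
\bigl|I_f(t) - I_g \circ \lambda(t)\bigr| \le \|f - g \circ \lambda\|_\infty + \|g\|_\infty \bigl(e^{\|\lambda\|} - 1\bigr), \quad \forall t \in [0, 1].
\]

We deduce that $d(f, g) < \delta$ implies

\[
d(I_f, I_g) \le \max\bigl\{\delta,\ \delta + \delta e^{\delta} \|g\|_\infty\bigr\} = \delta\bigl(1 + e^{\delta} \|g\|_\infty\bigr).
\]

As a result $f \mapsto \int_0^{\cdot} f(u)\,du$ is a continuous mapping. ■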
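Inequality (A.3) can be tested on an explicit time change. The following sketch is only an illustration: the time change $\lambda(t) = t + C \sin(2\pi t)/(2\pi)$ and the amplitude `C` are hypothetical choices. Its slope $1 + C\cos(2\pi t)$ ranges over $[1 - C, 1 + C]$, so that $\|\lambda\| = -\log(1 - C)$, and the code checks on a grid that $|\lambda(t) - \lambda(s) - (t - s)|$ never exceeds $(t - s)(e^{\|\lambda\|} - 1)$.

```python
import math

C = 0.4  # hypothetical amplitude, must satisfy 0 <= C < 1

def lam(t):
    """Time change on [0, 1]: strictly increasing, with lam(0) = 0 and lam(1) = 1."""
    return t + C * math.sin(2 * math.pi * t) / (2 * math.pi)

# Chord slopes of lam lie in [1 - C, 1 + C], hence ||lam|| = sup |log slope| = -log(1 - C).
NORM = -math.log(1.0 - C)

def worst_ratio(n=400):
    """Largest value of |lam(t) - lam(s) - (t - s)| / ((t - s) (e^{||lam||} - 1)) over a grid."""
    bound = math.exp(NORM) - 1.0
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            s, t = i / n, j / n
            ratio = abs(lam(t) - lam(s) - (t - s)) / ((t - s) * bound)
            worst = max(worst, ratio)
    return worst

if __name__ == "__main__":
    print(worst_ratio())  # (A.3) holds on the grid when this value is at most 1
```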

Using arguments similar to those presented in the proofs of Lemma A.1 and Lemma A.2, we obtain the following continuity result.

Lemma A.3. Let $\Phi$ be a continuous function, then the mapping

\[
f \in D([0,1]) \mapsto \Bigl(\int_0^t \Phi\bigl(f(t) - f(s)\bigr)\,ds,\ t \ge 0\Bigr)
\]

is continuous in Skorohod's topology.

B Invariant measure for the double infinite comb

Consider the probabilized context tree given on Figure 3. In this case, there are two infinite leaves $0^\infty$ and $1^\infty$ and a countable number of leaves $0^n 1$ and $1^n 0$, $n \in \mathbb{N}$. Suppose that $\pi$ is a stationary measure on $\mathcal{L}$. Denote by $\mathcal{W}$ the set of finite words on the alphabet $\{0, 1\}$. For any finite word $w \in \mathcal{W}$, we denote by $\pi(w) := \pi(\mathcal{L}w)$ the measure of the cylinder $\mathcal{L}w$, the set of left infinite words ending with $w$. We first compute $\pi(w)$ as a function of $\pi(1)$ when the reversed word of $w$ is any context or any internal node. Applying equation (1.2) to $U_n = \dots 10^n$, we get for any $n \ge 1$,

\[
\pi(10^n) = \pi(10^{n-1})\, q_{0^{n-1}1}(0).
\]

An immediate induction yields, for any $n \ge 1$,

\[
\pi(10^n) = \pi(10) \prod_{k=1}^{n-1} q_{0^k 1}(0) = \pi(10) \prod_{k=1}^{n-1} (1 - \alpha_{1,k}) = \pi(10)\, \mathcal{P}_1(n). \tag{B.1}
\]

In the same way,

\[
\pi(01^n) = \pi(01)\, \mathcal{P}_2(n). \tag{B.2}
\]

The stationary probability of a reversed context is thus necessarily given by Formulae (B.1) and (B.2). Now, if $0^n$ is any internal node of the context tree other than 0, we need to go down along the branch in the context tree to reach the contexts; using the disjoint union $\mathcal{L}0^n = \mathcal{L}0^{n+1} \cup \mathcal{L}10^n$, that is $\pi(0^{n+1}) = \pi(0^n) - \pi(10^n)$, we get by induction, for any $n \ge 2$,

\[
\pi(0^n) = \pi(0) - \pi(10) \sum_{k=1}^{n-1} \mathcal{P}_1(k). \tag{B.3}
\]

The same holds for any internal node $1^n$ other than 1,

\[
\pi(1^n) = \pi(1) - \pi(10) \sum_{k=1}^{n-1} \mathcal{P}_2(k), \tag{B.4}
\]

where we have used $\pi(01) = \pi(10)$ (coming from the invariance of $\pi$). The stationary probability of a reversed internal node of the context tree is thus necessarily given by Formulae (B.3) and (B.4).

It remains to compute $\pi(10)$ and then $\pi(0)$ (and consequently $\pi(1)$). The denumerable partition of the whole probability space given by all cylinders based on leaves in the context tree implies $1 - \pi(0^\infty) - \pi(1^\infty) = \pi(10) + \pi(100) + \dots + \pi(01) + \pi(011) + \dots$, i.e.

\[
1 - \pi(0^\infty) - \pi(1^\infty) = \pi(10) \sum_{n \ge 1} \bigl(\mathcal{P}_1(n) + \mathcal{P}_2(n)\bigr). \tag{B.5}
\]

This leads to the following statement that covers all cases of existence, uniqueness and non-triviality for a stationary probability measure for the double infinite comb. In the generic case (named irreducible case hereunder), we give a necessary and sufficient condition on the data for the existence of a stationary probability measure; moreover, when a stationary probability exists, it is unique. The reducible case is much more singular and gives rise to non-uniqueness.

Proposition B.1. (Stationary probability measures for a double infinite comb)

Let $(U_n)_{n \ge 0}$ be a VLMC defined by a probabilized double infinite comb.

(i) Irreducible case: Assume that $q_{0^\infty}(0) \ne 1$ and $q_{1^\infty}(1) \ne 1$.

(a) Existence: The Markov process $(U_n)_{n \ge 0}$ admits a stationary probability measure on $\mathcal{L}$ if and only if the numerical series $\Theta_1$ and $\Theta_2$ converge.

(b) Uniqueness: Assume that the series $\Theta_1$ and $\Theta_2$ converge. Then, the stationary probability measure $\pi$ on $\mathcal{L}$ is unique; it is characterized by

\[
\pi(0) = \frac{\Theta_1}{\Theta_1 + \Theta_2}, \qquad \pi(10) = \frac{1}{\Theta_1 + \Theta_2} \tag{B.6}
\]

and Formulae (B.1), (B.2), (B.3), (B.4).

(ii) Reducible cases: Assume that $q_{0^\infty}(0) = 1$ and $q_{1^\infty}(1) \ne 1$.

(a) If at least one of the series $\Theta_1$ and $\Theta_2$ diverges, then the trivial probability measure $\pi$ on $\mathcal{L}$ defined by $\pi(0^\infty) = 1$ is the unique stationary probability measure.

(b) If the series $\Theta_1$ and $\Theta_2$ converge, then there is a one parameter family of stationary probability measures on $\mathcal{L}$. More precisely, for any $a \in [0, 1]$, there exists a unique stationary probability measure $\pi_a$ on $\mathcal{L}$ such that $\pi_a(0^\infty) = a$. The probability $\pi_a$ is characterized by

\[
\pi_a(0) = \frac{a\Theta_2 + \Theta_1}{\Theta_1 + \Theta_2}, \qquad \pi_a(10) = \frac{1 - a}{\Theta_1 + \Theta_2}
\]

and Formulae (B.1), (B.2), (B.3), (B.4).

Assume that $q_{0^\infty}(0) \ne 1$ and $q_{1^\infty}(1) = 1$. Then the same results as in (ii.a) and (ii.b) hold, exchanging the roles of 0 and 1.

Assume that $q_{0^\infty}(0) = 1$ and $q_{1^\infty}(1) = 1$.
(c) If at least one of the series $\Theta_1$ and $\Theta_2$ diverges, then there is a one parameter family of stationary probability measures on $\mathcal{L}$. More precisely, for any $a \in [0, 1]$, there exists a unique stationary probability measure $\pi_a$ on $\mathcal{L}$ such that $\pi_a(0^\infty) = a$. The probability $\pi_a$ is characterized by $\pi_a(0^n) = a$ and $\pi_a(1^n) = 1 - a$ for every $n \ge 1$ and $\pi_a(w) = 0$ as soon as $w$ contains one 0 and one 1.

(d) If the series $\Theta_1$ and $\Theta_2$ converge, then there is a two parameter family of stationary probability measures on $\mathcal{L}$. More precisely, for any $a \in [0, 1]$ and $b \in [0, 1]$ with $a + b \le 1$, there exists a unique stationary probability measure $\pi_{a,b}$ on $\mathcal{L}$ such that $\pi_{a,b}(0^\infty) = a$ and $\pi_{a,b}(1^\infty) = b$. The probability $\pi_{a,b}$ is characterized by

\[
\pi_{a,b}(0) = \frac{a\Theta_2 + (1 - b)\Theta_1}{\Theta_1 + \Theta_2}, \qquad \pi_{a,b}(10) = \frac{1 - a - b}{\Theta_1 + \Theta_2}
\]

and Formulae (B.1), (B.2), (B.3), (B.4).

Proof.

(i) Assume that $q_{0^\infty}(0) \ne 1$, $q_{1^\infty}(1) \ne 1$ and that $\pi$ is a stationary probability measure. By definition of probability transitions, $\pi(0^\infty) = \pi(0^\infty)\, q_{0^\infty}(0)$ and $\pi(1^\infty) = \pi(1^\infty)\, q_{1^\infty}(1)$ so that $\pi(0^\infty)$ and $\pi(1^\infty)$ necessarily vanish. Thus, thanks to (B.5), $\pi(10) \ne 0$, the series $\Theta_1 + \Theta_2$ converges and so do $\Theta_1$ and $\Theta_2$. This also implies

\[
1 = \pi(10)(\Theta_1 + \Theta_2).
\]

Passing to the limit in (B.3) implies $\pi(0) = \pi(10)\Theta_1$. Thus Formula (B.6) is valid. Moreover, when $w$ is any context or any internal node of the context tree, $\pi(w)$ is necessarily given by Formulae (B.6), (B.1), (B.2), (B.3) and (B.4). Since the cylinders $\mathcal{L}w$, $w \in \mathcal{W}$, span the $\sigma$-algebra on $\mathcal{L}$, there is at most one stationary probability measure. This proves the only if part of (i.a), the uniqueness and the characterization claimed in (i.b).

Conversely, when the series converge, Formulae (B.6), (B.1), (B.2), (B.3), (B.4) define a probability measure on the semiring spanned by cylinders, which extends to a stationary probability measure on the whole $\sigma$-algebra on $\mathcal{L}$. This proves the if part of (i.a).

To deal with the reducible cases, recall the three following equations (which hold when the series converge):

\[
\begin{aligned}
1 - \pi(0^\infty) - \pi(1^\infty) &= \pi(10)(\Theta_1 + \Theta_2), \\
\pi(0^\infty) &= \pi(0) - \pi(10)\Theta_1, \\
\pi(1^\infty) &= \pi(1) - \pi(10)\Theta_2.
\end{aligned}
\]

(ii) Assume that $q_{0^\infty}(0) = 1$ and $q_{1^\infty}(1) \ne 1$. First, as above, $q_{1^\infty}(1) \ne 1$ implies $\pi(1^\infty) = 0$. Next, Formula (B.5) is always valid so that the divergence of at least one of the series forces $\pi(10)$ to vanish. This gives $\pi(0^\infty) = 1$. With the assumption $q_{0^\infty}(0) = 1$, one immediately sees that this trivial probability is stationary, proving (ii.a).

To prove (ii.b), assume furthermore that the series $\Theta_1$ and $\Theta_2$ converge and let $a \in [0, 1]$. As before, any stationary probability measure $\pi$ is completely determined by $\pi(0)$ and $\pi(10)$. As above, $\pi(1^\infty) = 0$ and if we fix $\pi(0^\infty) = a$, the system above reduces to

\[
\begin{aligned}
1 - a &= \pi(10)(\Theta_1 + \Theta_2), \\
a &= \pi(0) - \pi(10)\Theta_1.
\end{aligned}
\]

This gives the characterisation of (ii.b). Formulae (B.1), (B.2), (B.3), (B.4) extend $\pi_a$ in the standard way to the whole $\sigma$-algebra on $\mathcal{L}$ and $\pi_a$ is clearly stationary.
(ii.c) Assume that $q_{0^\infty}(0) = 1$ and $q_{1^\infty}(1) = 1$. As previously, Formula (B.5) is valid so that the divergence of at least one of the series forces $\pi(10)$ to vanish. Let $a \in [0, 1]$ and fix $\pi(0^\infty) = a$; the relations above reduce to $\pi(0^\infty) = \pi(0) = a$ and $\pi(1^\infty) = \pi(1) = 1 - a$. The invariance of this measure may be easily checked.

To prove (ii.d), assume furthermore that the series $\Theta_1$ and $\Theta_2$ converge and let $a \in [0, 1]$ and $b \in [0, 1]$ with $a + b \le 1$. If we fix $\pi(0^\infty) = a$ and $\pi(1^\infty) = b$, the system above is equivalent to

\[
\begin{aligned}
\pi(0) - \pi(10)\Theta_1 &= a, \\
\pi(0) + \pi(10)\Theta_2 &= 1 - b.
\end{aligned}
\]

As $\Theta_1 \ge 1$ and $\Theta_2 \ge 1$, this system has a unique solution given by

\[
\pi_{a,b}(0) = \frac{a\Theta_2 + (1 - b)\Theta_1}{\Theta_1 + \Theta_2} \quad \text{and} \quad \pi_{a,b}(10) = \frac{1 - a - b}{\Theta_1 + \Theta_2}. \qquad ■
\]
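The irreducible case of Proposition B.1 can be illustrated numerically. The sketch below is a hedged illustration, not the authors' procedure: it uses $\mathcal{P}_i(n) = \prod_{k=1}^{n-1}(1 - \alpha_{i,k})$ as in (B.1) and $\Theta_i = \sum_{n \ge 1} \mathcal{P}_i(n)$ as in the proof, truncates the series at a hypothetical number of terms `n_terms`, and returns $\pi(10) = 1/(\Theta_1 + \Theta_2)$ and $\pi(0) = \Theta_1 \pi(10)$. The function name and the constant sequences used in the example are ours.

```python
def comb_stationary(alpha1, alpha2, n_terms=10_000):
    """Stationary weights of the double infinite comb in the irreducible case.

    alpha1(k) and alpha2(k) return alpha_{1,k} and alpha_{2,k} for k >= 1.
    The series Theta_i = sum_{n>=1} P_i(n) is truncated after n_terms terms,
    which is only meaningful when both series converge.
    """
    def theta(alpha):
        total, p = 0.0, 1.0          # p = P_i(1) = 1 (empty product)
        for n in range(1, n_terms + 1):
            total += p               # add P_i(n)
            p *= 1.0 - alpha(n)      # p becomes P_i(n + 1)
        return total

    th1, th2 = theta(alpha1), theta(alpha2)
    pi10 = 1.0 / (th1 + th2)
    pi0 = th1 * pi10
    return {"theta1": th1, "theta2": th2, "pi10": pi10, "pi0": pi0, "pi1": 1.0 - pi0}

if __name__ == "__main__":
    # Constant sequences: the letter process is then an ordinary two-state Markov chain.
    print(comb_stationary(lambda k: 0.5, lambda k: 0.25))
```

With constant sequences $\alpha_{1,k} = 0.5$ and $\alpha_{2,k} = 0.25$ the series are geometric, $\Theta_1 = 2$ and $\Theta_2 = 4$, and the output agrees with the stationary law $(1/3, 2/3)$ of the two-state Markov chain that leaves 0 with probability 0.5 and leaves 1 with probability 0.25.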